CN110727389A - File cleaning method and system - Google Patents

File cleaning method and system

Info

Publication number
CN110727389A
CN110727389A (application CN201810776134.0A)
Authority
CN
China
Prior art keywords
cleaning
file
target
queue
files
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810776134.0A
Other languages
Chinese (zh)
Other versions
CN110727389B (en)
Inventor
陈俊发
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wangsu Science and Technology Co Ltd
Original Assignee
Wangsu Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wangsu Science and Technology Co Ltd filed Critical Wangsu Science and Technology Co Ltd
Priority to CN201810776134.0A priority Critical patent/CN110727389B/en
Publication of CN110727389A publication Critical patent/CN110727389A/en
Application granted granted Critical
Publication of CN110727389B publication Critical patent/CN110727389B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/0601 Interfaces specially adapted for storage systems
    • G06F 3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F 3/0608 Saving storage space on storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/0601 Interfaces specially adapted for storage systems
    • G06F 3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F 3/0604 Improving or facilitating administration, e.g. storage management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/0601 Interfaces specially adapted for storage systems
    • G06F 3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F 3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F 3/0652 Erasing, e.g. deleting, data cleaning, moving of data to a wastebasket
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/0601 Interfaces specially adapted for storage systems
    • G06F 3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F 3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the present invention relate to the technical field of cloud services, and disclose a file cleaning method and a file cleaning system. The file cleaning method includes: calling a first thread to query for target cleaning files and adding the target cleaning files to a cleaning queue; and calling a plurality of second threads to delete the target cleaning files in the cleaning queue in sequence. With the embodiments of the present invention, thread blocking can be avoided and file cleaning efficiency can be improved.

Description

File cleaning method and system
Technical Field
The embodiment of the invention relates to the technical field of cloud services, in particular to a file cleaning method and a file cleaning system.
Background
With the continuous growth of network bandwidth, cloud services have become very common in daily life. Cloud storage is a new form of network storage that is popular with enterprise and individual users. Cloud storage is generally billed by storage capacity. For a file that does not need to be stored permanently, a user can set an expiration time; once the set expiration time is reached, the file is regarded as an expired file, and the cloud storage server needs to clean up the expired file in time to release storage space.
The inventor has found at least the following problem in the prior art: after a batch of expired files is queried within a single thread, the deletion commands are also executed by that same thread, which may cause thread blocking; that is, a later expired file can only be deleted after deletion of the previous expired file has completed.
Disclosure of Invention
An object of embodiments of the present invention is to provide a file cleaning method and a file cleaning system that can avoid thread blocking and improve file cleaning efficiency.
In order to solve the above technical problem, an embodiment of the present invention provides a file cleaning method, including: calling a first thread to query for target cleaning files and adding the target cleaning files to a cleaning queue; and calling a plurality of second threads to delete the target cleaning files in the cleaning queue in sequence.
An embodiment of the present invention also provides a file cleaning system, which includes N servers, where N is a natural number greater than or equal to 2. Each server includes at least one processor, and a memory communicatively coupled to the at least one processor, where the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the file cleaning method described above.
Compared with the prior art, embodiments of the present invention query for target cleaning files through the first thread, add the target cleaning files to the cleaning queue, and clean the target cleaning files in the cleaning queue through the plurality of second threads, thereby cleaning up expired files. Since the second threads can each read and clean target cleaning files from the cleaning queue, the file cleaning speed can be greatly increased and the cleaning efficiency improved.
In addition, whether the cleaning condition of the target cleaning file in the cleaning queue meets a dormancy condition is judged, and if the cleaning condition meets the dormancy condition, the first thread is controlled to be dormant for a preset dormancy duration. By controlling the first thread to sleep, the accumulation of the target cleaning files in the cleaning queue can be prevented.
In addition, determining whether the cleaning condition of the target cleaning files in the cleaning queue meets the dormancy condition specifically includes: when the growth speed of the target cleaning files in the cleaning queue is greater than a preset increase threshold, determining that the cleaning condition meets the dormancy condition; or, when the number of target cleaning files in the cleaning queue is greater than a preset dormancy number, determining that the cleaning condition meets the dormancy condition. Write overflow of the cleaning queue can thereby be prevented.
In addition, the file cleaning method further comprises the following steps: pre-starting a preset number of servers, and executing, by the started servers: calling a first thread to query to obtain a target cleaning file, and adding the target cleaning file to a cleaning queue; calling a plurality of second threads to delete the target cleaning files in the cleaning queue in sequence; when the cleaning condition of the target cleaning file in the cleaning queue meets the capacity expansion condition, the capacity expansion is carried out on the started server; and when the cleaning condition of the target cleaning file in the cleaning queue meets the capacity reduction condition, carrying out capacity reduction on the started server. The file is cleaned in a dynamic capacity expansion and dynamic capacity reduction mode, so that not only can the performance bottleneck be avoided, but also the resource waste can be prevented.
In addition, expanding the capacity of the enabled servers specifically includes: determining whether the first thread queries for the target cleaning files in a concurrent mode; if so, expanding the enabled servers according to a capacity expansion ratio; and if not, expanding according to the number of target cleaning files in the cleaning queue and a preset cleaning duration. The capacity expansion can thus meet the requirements of different service scenarios, such as concurrent query and non-concurrent (also called discrete) query, so that the expansion better matches actual needs.
In addition, the capacity expansion ratio is determined according to the number of the data tables which are inquired concurrently. Therefore, the capacity expansion quantity in the concurrent scene is more suitable for the actual requirement.
In addition, reducing the capacity of the enabled servers specifically includes: when the first thread queries for the target cleaning files in the concurrent query mode, determining a recovery quantity range according to the correspondence between the number of target cleaning files in the cleaning queue and the recovery quantity range, and recovering servers according to the determined range, where the correspondence contains multiple groups of target cleaning file numbers and recovery quantity ranges; and when the first thread does not query for the target cleaning files in the concurrent mode and a preset capacity reduction duration has been reached, reducing the enabled servers to the preset number. The capacity reduction can thus meet the requirements of different service scenarios, such as concurrent query and non-concurrent query, so that the reduction better matches actual needs.
In addition, when the growth speed of the target cleaning file in the cleaning queue is greater than a capacity expansion growth threshold value, judging that the cleaning condition of the target cleaning file in the cleaning queue meets a capacity expansion condition; and when the reduction speed of the target cleaning file in the cleaning queue is greater than a reduction threshold, judging that the cleaning condition of the target cleaning file in the cleaning queue meets a capacity reduction condition. Thereby effectively meeting the requirements of capacity expansion and capacity reduction.
Drawings
One or more embodiments are illustrated by way of example in the accompanying drawings, in which like reference numerals denote similar elements. The figures are not drawn to scale unless otherwise specified.
FIG. 1 is a flowchart of a file cleaning method according to a first embodiment of the present invention;
FIG. 2 is a flowchart of a file cleaning method according to a second embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a file cleaning system according to a third embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions, and advantages of the embodiments of the present invention clearer, the embodiments of the present invention are described in detail below with reference to the accompanying drawings. Those of ordinary skill in the art will appreciate that numerous technical details are set forth in the embodiments to provide a better understanding of the present application; however, the technical solutions claimed in the present application can also be implemented without these technical details, or with various changes and modifications based on the following embodiments.
The first embodiment of the present invention relates to a file cleaning method, which can be applied to one server or to a group of servers (referred to as file cleaning servers). The number of file cleaning servers can be configured according to actual needs: when a single server can meet the file cleaning requirement, a single file cleaning server may be enabled; when the number of expired files is very large, multiple file cleaning servers may be enabled to meet the requirement. Referring to FIG. 1, the file cleaning method includes steps 101 and 102.
Step 101: calling a first thread to query for target cleaning files, and adding the target cleaning files to a cleaning queue.
Step 102: calling a plurality of second threads to delete the target cleaning files in the cleaning queue in sequence.
Compared with the prior art, the present embodiment queries for target cleaning files through the first thread, adds the target cleaning files to the cleaning queue, and cleans the target cleaning files in the cleaning queue through the plurality of second threads, thereby cleaning up expired files. Since the second threads can each read and clean target cleaning files from the cleaning queue, the file cleaning speed can be greatly increased and the cleaning efficiency improved.
Implementation details of the file cleaning method of the present embodiment are described below. The following details are provided only for ease of understanding and are not required for implementing the present embodiment.
Specifically, the target cleaning file refers to an expired file in the cloud storage server, but is not limited thereto; it may be any other file that needs to be cleaned. A file system generally includes file data and metadata. File data refers to the actual data of an ordinary file. Metadata, also called intermediary data or relay data, is data about data: mainly information describing data attributes, used to support functions such as indicating storage location, historical data, resource search, access authority, and file records. The metadata may include the file's attributes, size, creation time, access time, owner group, and similar information. The metadata can be stored in a MongoDB database, and the expiration time of a file can be marked through an attribute; once the set expiration time point is reached, the file needs to be cleaned. The cloud storage server stores a large number of files, and files may expire at any point in time, so expired files can be checked for and cleaned periodically.
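By way of illustration only, the periodic query for expired metadata described above might look like the following sketch (Python with pymongo). The collection name file_metadata and the field names expire_at and path are assumptions made for this example and are not specified by the present disclosure.

```python
# Illustrative sketch only: the collection name (file_metadata) and field
# names (expire_at, path) are assumptions for this example, not part of the
# disclosure. Expiration times are assumed to be stored as epoch seconds.
import time
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
metadata = client["cloud_storage"]["file_metadata"]

def query_expired_files(limit=1000):
    """Return metadata records of files whose expiration time has passed."""
    now = time.time()
    return list(metadata.find({"expire_at": {"$lte": now}}).limit(limit))
```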
When cleaning up expired files, the server needs to run corresponding processes and threads. A process is the basic execution entity of a program and is a description of instructions, data, and their organization. A thread is a single sequential control flow within a program; it is a relatively independent, schedulable execution unit within a process, the basic unit by which the system schedules and allocates the CPU, and the scheduling unit of a running program. Specifically, the file cleaning server may create the first thread by calling the pthread_create function. The first thread can be dedicated to querying the database for target cleaning files, and the number of first threads may be one; when multiple file cleaning servers are enabled, a designated file cleaning server can start the first thread. The target cleaning files retrieved by the first thread can be added to a cleaning queue; in particular, metadata of the target cleaning files can be cached in the cleaning queue so that the second threads can conveniently obtain the target cleaning files. The cleaning queue is, for example, any one of Redis, RabbitMQ, or ActiveMQ, but is not limited thereto. With Redis as a shared queue, multiple file cleaning servers can access it to obtain target cleaning files; alternatively, cleaning queues can be created on each of the file cleaning servers, and the first thread can write the target cleaning files into the cleaning queues on the respective servers. The first thread may be enabled periodically under the control of a timer, for example every 30 minutes or every hour; the enabling period of the first thread is not particularly limited in the present embodiment. The first thread may exit after the database query is finished.
The second threads may be dedicated to deleting the target cleaning files queried by the first thread, and the number of second threads is more than one. Specifically, a thread pool may be created in advance, at least part of the target cleaning files in the cleaning queue read into the thread pool, a plurality of second threads enabled in the thread pool, and the target cleaning files in the thread pool cleaned by the plurality of second threads; this is not limiting, however, and the plurality of second threads may also be created in other ways. The creation of threads and thread pools is known to those skilled in the art and is not described here. It should be noted that each second thread needs to first delete the underlying file (i.e., the actual data) of a target cleaning file in the cloud storage server, and only after that deletion succeeds delete the metadata of the file from the database.
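The division of labor between the first (query) thread and the plurality of second (delete) threads can be sketched as a simple producer/consumer arrangement. The sketch below is a minimal illustration, not the claimed implementation: an in-process queue.Queue stands in for the shared cleaning queue (e.g., Redis), and query_expired_files, delete_underlying_file, and delete_metadata are hypothetical placeholders for the database query and the two-step deletion (actual data first, then metadata).

```python
# Minimal producer/consumer sketch of the first (query) thread and several
# second (delete) threads. queue.Queue stands in for the shared cleaning
# queue (e.g. Redis); the three placeholder functions are hypothetical and
# only mark where the database query and the two-step deletion would go.
import queue
import threading

cleaning_queue = queue.Queue()
SENTINEL = object()          # used only so the example can terminate cleanly
NUM_SECOND_THREADS = 8

def query_expired_files():
    """Placeholder: query the metadata database for expired files."""
    return []

def delete_underlying_file(path):
    """Placeholder: delete the actual file data from cloud storage; True on success."""
    return True

def delete_metadata(file_id):
    """Placeholder: remove the file's metadata record from the database."""

def first_thread():
    """Query for target cleaning files and push their metadata onto the queue."""
    for meta in query_expired_files():
        cleaning_queue.put(meta)
    for _ in range(NUM_SECOND_THREADS):
        cleaning_queue.put(SENTINEL)

def second_thread():
    """Pop target cleaning files and delete them: data first, then metadata."""
    while True:
        meta = cleaning_queue.get()
        if meta is SENTINEL:
            break
        if delete_underlying_file(meta.get("path")):
            delete_metadata(meta.get("_id"))

if __name__ == "__main__":
    workers = [threading.Thread(target=second_thread) for _ in range(NUM_SECOND_THREADS)]
    for w in workers:
        w.start()
    first_thread()
    for w in workers:
        w.join()
```

In an actual deployment the second threads would run continuously in a thread pool on each file cleaning server and the queue would be a shared service such as Redis; the sentinel shutdown is used here only to keep the example self-contained.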
In some cases, the file cleaning speed of the second threads may lag behind the speed at which the first thread queries for target cleaning files, causing the number of target cleaning files in the cleaning queue to keep growing. By putting the first thread to sleep, the first thread and the second threads can be better matched, preventing the target cleaning files in the cleaning queue from piling up excessively or overflowing. Optionally, it may be determined whether the cleaning condition of the target cleaning files in the cleaning queue meets a sleep condition, and if so, the first thread is controlled to sleep for a preset sleep duration. The cleaning condition of the target cleaning files can be characterized, for example, by the increase or decrease in the number of target cleaning files in the cleaning queue, or by the speed of that increase or decrease. In practical applications, when the growth speed of the target cleaning files in the cleaning queue is greater than a preset increase threshold, it may be determined that the cleaning condition meets the sleep condition, and the first thread is then controlled to sleep for the preset sleep duration, for example any value from 2 to 5 seconds. The preset increase threshold may be expressed as the number of target cleaning files added per unit time; for example, with a limit of 4 target cleaning files added within 5 seconds, if more than 4 are added the first thread is put to sleep, and if 4 or fewer are added the first thread continues to run.
In practical applications, when the number of target cleaning files in the cleaning queue is greater than a preset sleep count, it is determined that the cleaning condition of the target cleaning files in the cleaning queue meets the sleep condition. The preset sleep count is, for example, a relatively large number, or the maximum number, of target cleaning files that the cleaning queue can hold. Thus, when the number of target cleaning files in the cleaning queue is large or has reached the maximum, the first thread is put to sleep, which balances the processing speeds of the first thread and the second threads and prevents the cleaning queue from backing up or overflowing. The present embodiment does not particularly limit the manner of determining the cleaning condition of the target cleaning files.
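The two sleep conditions described above (growth speed above a preset increase threshold, or queue length above a preset sleep count) could be evaluated by the query thread roughly as in the following sketch. The concrete figures follow the examples in the text where given (more than 4 files per 5 seconds, a sleep of 2 to 5 seconds); the maximum queue length is an assumed, configurable value.

```python
# Sketch of the sleep decision for the first (query) thread. The growth
# threshold and sleep duration mirror the examples in the description;
# the maximum queue length is an assumption.
import time

GROWTH_WINDOW_SECONDS = 5      # growth is measured over 5-second windows
GROWTH_THRESHOLD = 4           # more than 4 new files per window triggers sleep
MAX_QUEUE_LENGTH = 10_000      # preset sleep count (assumed value)
SLEEP_SECONDS = 3              # preset sleep duration, e.g. any value in 2-5 s

def meets_sleep_condition(previous_length, current_length):
    """True if either sleep condition on the cleaning queue is met."""
    growth = current_length - previous_length
    return growth > GROWTH_THRESHOLD or current_length > MAX_QUEUE_LENGTH

def throttle_first_thread(cleaning_queue):
    """Measure queue growth over one window and pause the query thread if needed."""
    previous_length = cleaning_queue.qsize()
    time.sleep(GROWTH_WINDOW_SECONDS)
    current_length = cleaning_queue.qsize()
    if meets_sleep_condition(previous_length, current_length):
        time.sleep(SLEEP_SECONDS)
```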
Compared with the prior art, the present embodiment distributes the query and cleaning tasks for target cleaning files to different threads and cleans files through multiple threads, which avoids thread blocking and improves file cleaning efficiency. By controlling the query thread (i.e., the first thread) to sleep, the processing progress of the query thread is matched to that of the delete threads (i.e., the second threads), which effectively prevents the cleaning queue from backing up or overflowing and prevents expired files from being missed.
A second embodiment of the present invention relates to a file cleaning method. The second embodiment is an improvement on the first embodiment, and the main improvement lies in that: in the second embodiment, the file is cleared by dynamic capacity expansion and dynamic capacity reduction, so that performance bottleneck and resource waste can be avoided.
Referring to FIG. 2, the file cleaning method of the present embodiment includes steps 201 to 204.
Step 201: a preset number of servers are enabled in advance.
The number of expired files generated at different points in time may differ; the preset number may therefore be the number of servers that can meet the cleaning requirement for an average number of expired files. Specifically, the preset number may be obtained through statistics, for example by counting the average number of servers required to perform file cleaning over a period of time and taking that average as the preset number. The present embodiment does not particularly limit the manner in which the preset number is set.
Step 202: executing, by the enabled servers: calling a first thread to query for target cleaning files and adding the target cleaning files to a cleaning queue; and calling a plurality of second threads to delete the target cleaning files in the cleaning queue in sequence.
Step 202 is similar to steps 101 and 102 of the first embodiment, and is not described here again.
Step 203: when the cleaning condition of the target cleaning files in the cleaning queue meets the capacity expansion condition, expanding the capacity of the enabled servers.
In step 203, whether the cleaning condition of the target cleaning files in the cleaning queue meets the capacity expansion condition may be determined as follows: when the growth speed of the target cleaning files in the cleaning queue is greater than a capacity expansion growth threshold, it is determined that the cleaning condition meets the capacity expansion condition; when the growth speed is less than or equal to the capacity expansion growth threshold, it is determined that the cleaning condition does not meet the capacity expansion condition. This is not limiting, however.
The capacity expansion growth threshold reflects the degree of matching between the cleaning speed of the enabled servers and the query speed of the first thread, and it may be set in the same way as, or differently from, the preset increase threshold in the first embodiment. The capacity expansion growth threshold may likewise be expressed as the number of target cleaning files added per unit time; for example, with a limit of 4 target cleaning files added within 5 seconds, if more than 4 target cleaning files are added in 5 seconds the enabled servers are expanded, and if 4 or fewer are added the number of enabled servers is kept unchanged. The present embodiment does not particularly limit the manner of determining whether the cleaning condition satisfies the capacity expansion condition or the manner of setting the capacity expansion growth threshold.
In step 203, expanding the capacity of the enabled servers specifically includes: determining whether the first thread queries for the target cleaning files in a concurrent mode; if so, expanding the enabled servers according to a capacity expansion ratio; and if not, expanding according to the number of target cleaning files in the cleaning queue and a preset cleaning duration.
The first thread queries for expired files through the database that stores the file metadata. The database may store information in the form of data tables. The concurrent mode means that the first thread queries multiple data tables at the same time; in this case there may be several first threads to satisfy the concurrent query requirement. The range of the concurrent query may be the entire data space of the database, for example all data tables, or several data tables of different numbers; the present embodiment does not particularly limit the size of the data space queried concurrently (also referred to as the space granularity). In contrast to the concurrent mode is the discrete query, i.e., a query mode with a specified space granularity, for example querying one specified data table each time. Because concurrent query is faster, the number of target cleaning files found can grow more quickly; because discrete query specifies a particular space granularity, the numbers of expired files found in different space granularities may differ greatly.
The capacity expansion ratio may be relative to the preset number of servers enabled in advance, or relative to the number of servers already enabled; the base corresponding to the capacity expansion ratio is not particularly limited. During concurrent query, the capacity expansion ratio may be determined according to the proportion or number of data tables queried concurrently; for example, when the data tables queried concurrently account for 20% of all data tables, the capacity expansion ratio is 1.2, and as the percentage of concurrently queried data tables increases, the capacity expansion ratio increases. When the concurrent mode is an ultra-large-scale concurrency (the entire data space of the database), the capacity expansion ratio is at its maximum, for example 3.
When the target cleaning files are queried in the discrete mode, capacity expansion can be performed according to the number of target cleaning files in the cleaning queue and the preset cleaning duration. Specifically, an estimated cleaning time required for the enabled servers to clean the target cleaning files in the cleaning queue may be calculated. For example, the number of target cleaning files, the size of each target cleaning file, the type of storage medium, and the number of enabled servers are read; from the number of target cleaning files, the size of each file, and the storage medium type, the required amount of work (number of servers multiplied by time) is obtained, and the quotient of this required amount and the number of enabled servers gives the estimated cleaning time, for example 1 hour. If the preset cleaning duration is, for example, 0.5 hour, the servers need to be expanded: the quotient of the required amount of work and the preset cleaning duration gives the required total number of servers, and subtracting the number of enabled servers from this total gives the number of servers to be added. If the servers have roughly the same performance, then when the estimated cleaning time is twice the preset cleaning duration, the number of enabled servers needs to be doubled. This is not limiting, however; a fixed number of servers, for example 2, may also be added each time expansion is needed.
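A minimal sketch of the two capacity expansion paths, assuming the example figures given above: a 1.2x ratio when 20% of the data tables are queried concurrently, a maximum ratio of 3 for whole-database concurrency, and the time-based estimate in the discrete case. The ratio table and the per-server cleaning rate are illustrative assumptions, not values fixed by the disclosure.

```python
# Sketch of dynamic expansion. The ratio table follows the examples in the
# text (1.2x at 20% of tables, 3x for whole-database concurrency); the
# per-server cleaning rate and any intermediate ratios are assumptions.
import math

EXPANSION_RATIOS = [       # (max fraction of tables queried concurrently, ratio)
    (0.2, 1.2),
    (1.0, 3.0),            # whole-database concurrency -> maximum ratio
]

def expansion_ratio(concurrent_fraction):
    """Pick the expansion ratio for the fraction of data tables queried concurrently."""
    for max_fraction, ratio in EXPANSION_RATIOS:
        if concurrent_fraction <= max_fraction:
            return ratio
    return EXPANSION_RATIOS[-1][1]

def servers_to_add_concurrent(enabled_servers, concurrent_fraction):
    """Concurrent mode: expand the enabled servers by the expansion ratio."""
    target = math.ceil(enabled_servers * expansion_ratio(concurrent_fraction))
    return max(0, target - enabled_servers)

def servers_to_add_discrete(num_files, avg_file_size_bytes, per_server_rate_bytes_per_s,
                            enabled_servers, preset_cleaning_seconds):
    """Discrete mode: estimate the cleaning time and add servers if it exceeds
    the preset cleaning duration."""
    # Total work expressed in server-seconds (one server cleaning at the given rate).
    server_seconds = num_files * avg_file_size_bytes / per_server_rate_bytes_per_s
    estimated_seconds = server_seconds / enabled_servers
    if estimated_seconds <= preset_cleaning_seconds:
        return 0
    required_total = math.ceil(server_seconds / preset_cleaning_seconds)
    return required_total - enabled_servers
```

With roughly equal servers, an estimated cleaning time of 1 hour against a preset duration of 0.5 hour yields a required total of twice the enabled servers, i.e., the enabled servers are doubled, consistent with the example above.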
Step 204: when the cleaning condition of the target cleaning files in the cleaning queue meets the capacity reduction condition, reducing the capacity of the enabled servers.
In step 204, whether the cleaning condition of the target cleaning files in the cleaning queue meets the capacity reduction condition may be determined as follows: when the reduction speed of the target cleaning files in the cleaning queue is greater than a reduction threshold, it is determined that the cleaning condition meets the capacity reduction condition; when the reduction speed is less than or equal to the reduction threshold, it is determined that the cleaning condition does not meet the capacity reduction condition. This is not limiting, however.
The reduction threshold reflects the degree of matching between the cleaning speed of the enabled servers and the query speed of the first thread. The reduction threshold may be set with reference to the preset increase threshold in the first embodiment, except that the reduction threshold is a threshold, or threshold range, at which servers need to be reduced because the cleaning speed is greater than the query speed.
In step 204, reducing the capacity of the enabled servers specifically includes: when the first thread queries for the target cleaning files in the concurrent query mode, determining a recovery quantity range according to the correspondence between the number of target cleaning files in the cleaning queue and the recovery quantity range, and recovering servers according to the determined range, where the correspondence contains multiple groups of target cleaning file numbers and recovery quantity ranges; and when the first thread does not query for the target cleaning files in the concurrent mode and a preset capacity reduction duration has been reached, reducing the enabled servers to the preset number. This is not limiting, however.
The number of target cleaning files in the cleaning queue can reach the order of millions. When servers are recovered in the concurrent mode, the number of target cleaning files can be divided into several levels, for example 5 levels, with 5 corresponding recovery quantity ranges, for example 1 to 2, 2 to 3, 3 to 4, 4 to 5, and 5 to 6 servers. The number of target cleaning files in the cleaning queue can then be monitored periodically, and the recovery quantity range corresponding to the number at each time point is selected for recovering servers. The correspondence can be preset in a lookup table in which several different target cleaning file levels are defined; the recovery quantity ranges corresponding to the levels may be the same or different. The recovery quantity range corresponding to each level can also be calculated from the reduction speed of the target cleaning files; specifically, the number of servers to recover may be calculated such that, after the recovery quantity range is subtracted from the number of enabled servers, the deletion speed of the target cleaning files still reaches a set value, but this is not limiting. Recovering servers by table lookup makes it very convenient to determine the recovery quantity without complex calculation, and resources can be recovered effectively.
When recovering in the discrete mode, a preset capacity reduction duration can be set in advance; when the preset capacity reduction duration is reached, the enabled servers are directly reduced to the preset number. The preset capacity reduction duration can be set with reference to the preset cleaning duration. The present embodiment does not particularly limit the preset capacity reduction duration or the number of servers after reduction.
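The two capacity reduction paths might be sketched as follows. The five recovery quantity ranges (1-2 up to 5-6 servers) follow the example above, while the queue-length tier boundaries, and the assumption that a smaller remaining backlog allows more servers to be recovered, are illustrative choices not fixed by the disclosure.

```python
# Sketch of dynamic capacity reduction. The five recovery ranges follow the
# example in the text; the queue-length tier boundaries and the mapping
# direction (smaller backlog -> recover more servers) are assumptions.
RECOVERY_TABLE = [                    # (queue length upper bound, (min, max) servers to recover)
    (200_000, (5, 6)),                # queue nearly drained: recover the most servers
    (400_000, (4, 5)),
    (600_000, (3, 4)),
    (800_000, (2, 3)),
    (float("inf"), (1, 2)),           # large remaining backlog: recover the fewest
]

def recovery_range_concurrent(queue_length):
    """Concurrent mode: look up how many servers may be recovered for the
    current number of target cleaning files in the cleaning queue."""
    for upper_bound, recovery_range in RECOVERY_TABLE:
        if queue_length <= upper_bound:
            return recovery_range
    return RECOVERY_TABLE[-1][1]

def servers_after_discrete_reduction(elapsed_seconds, preset_reduction_seconds,
                                     enabled_servers, preset_count):
    """Discrete mode: once the preset reduction duration has elapsed, drop the
    enabled servers back to the preset number."""
    if elapsed_seconds >= preset_reduction_seconds:
        return preset_count
    return enabled_servers
```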
Step 203 and step 204 may be executed in parallel or sequentially, and the execution manner of dynamic capacity expansion and dynamic capacity reduction is not particularly limited in this embodiment.
It should be noted that dynamic capacity expansion or dynamic recovery may also take other conditions into account. For example, when a certain amount of cloud storage space urgently needs to be freed, the server recovery strategy can be temporarily disabled; when no additional condition applies, recovery can be performed so that only a small number of servers is used.
Compared with the foregoing embodiments, the present embodiment can adopt more appropriate dynamic capacity expansion and dynamic capacity reduction modes for different deletion service types such as concurrent and discrete, thereby not only avoiding performance bottleneck, but also effectively preventing resource waste.
The steps of the above methods are divided as shown for clarity of description. In implementation, steps may be combined into one step or a step may be split into multiple steps; as long as the same logical relationship is included, such variations fall within the protection scope of this patent. Adding insignificant modifications to an algorithm or process, or introducing insignificant design changes, without changing the core design of the algorithm or process also falls within the protection scope of this patent.
A third embodiment of the present invention relates to a file cleaning system. Referring to FIG. 3, the file cleaning system includes N servers 1, where N is a natural number greater than or equal to 2. Each server 1 is connected to a server group 2 that provides the cloud storage service, and the servers 1 are communicatively connected to one another. The first thread and the cleaning queue may be implemented on one of the servers 1, and a second thread may be implemented on each server 1. Each server 1 includes at least one processor and a memory communicatively coupled to the at least one processor, where the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the file cleaning method according to the first or second embodiment.
Where the memory and processor are connected by a bus, the bus may comprise any number of interconnected buses and bridges, the buses connecting together one or more of the various circuits of the processor and the memory. The bus may also connect various other circuits such as peripherals, voltage regulators, power management circuits, and the like, which are well known in the art, and therefore, will not be described any further herein. A bus interface provides an interface between the bus and the transceiver. The transceiver may be one element or a plurality of elements, such as a plurality of receivers and transmitters, providing a means for communicating with various other apparatus over a transmission medium. The data processed by the processor is transmitted over a wireless medium via an antenna, which further receives the data and transmits the data to the processor.
The processor is responsible for managing the bus and general processing and may also provide various functions including timing, peripheral interfaces, voltage regulation, power management, and other control functions. And the memory may be used to store data used by the processor in performing operations.
Compared with the prior art, the present embodiment distributes the query and cleaning tasks for target cleaning files to different threads and cleans files through multiple threads, which avoids thread blocking and improves file cleaning efficiency. By controlling the query thread (i.e., the first thread) to sleep, the processing progress of the query thread is matched to that of the delete threads (i.e., the second threads), which effectively prevents the cleaning queue from backing up or overflowing and prevents expired files from being missed.
It will be understood by those of ordinary skill in the art that the foregoing embodiments are specific examples for carrying out the invention, and that various changes in form and details may be made therein without departing from the spirit and scope of the invention in practice.

Claims (10)

1. A file cleaning method, comprising:
calling a first thread to query to obtain a target cleaning file, and adding the target cleaning file to a cleaning queue;
and calling a plurality of second threads to delete the target cleaning files in the cleaning queue in sequence.
2. The file cleaning method according to claim 1, further comprising:
judging whether the cleaning condition of the target cleaning file in the cleaning queue meets a dormancy condition, and if the cleaning condition meets the dormancy condition, controlling the first thread to be dormant for a preset dormancy time.
3. The method according to claim 2, wherein the determining whether the cleaning status of the target cleaning file in the cleaning queue meets a hibernation condition specifically comprises:
when the increasing speed of the target cleaning files in the cleaning queue is greater than a preset increasing threshold value, judging that the cleaning condition of the target cleaning files in the cleaning queue meets the dormancy condition, or
And when the number of the target cleaning files in the cleaning queue is larger than the preset dormancy number, judging that the cleaning condition of the target cleaning files in the cleaning queue meets the dormancy condition.
4. The file cleaning method according to any one of claims 1 to 3, further comprising:
pre-starting a preset number of servers, and executing, by the started servers:
calling a first thread to query to obtain a target cleaning file, and adding the target cleaning file to a cleaning queue;
calling a plurality of second threads to delete the target cleaning files in the cleaning queue in sequence;
when the cleaning condition of the target cleaning file in the cleaning queue meets the capacity expansion condition, the capacity expansion is carried out on the started server; and
and when the cleaning condition of the target cleaning file in the cleaning queue meets the capacity reduction condition, carrying out capacity reduction on the started server.
5. The file cleaning method according to claim 4, wherein the expanding the enabled server specifically includes:
judging whether the first thread inquires the target cleaning file according to a concurrent mode, and if the first thread inquires the target cleaning file according to the concurrent mode, expanding the capacity of the started server according to the capacity expansion ratio;
and if the target cleaning files are not inquired in the concurrent mode, expanding the capacity according to the number of the target cleaning files in the cleaning queue and the preset cleaning duration.
6. The method according to claim 5, wherein the expansion ratio is determined according to the number of the data tables to be concurrently queried.
7. The file cleaning method according to claim 5, wherein the capacity reduction of the enabled server specifically includes:
when the first thread queries the target cleaning files in a concurrent query mode, determining a recovery quantity range according to the corresponding relation between the quantity of the target cleaning files in the cleaning queue and the recovery quantity range, and recovering the server according to the determined recovery quantity range; the corresponding relation comprises the number of the multiple groups of target cleaning files and the range of the recycling number;
and when the first thread does not inquire the target cleaning file according to the concurrent mode and the preset capacity reduction time length is reached, reducing the capacity of the started servers to the preset number.
8. The file cleaning method according to claim 4,
when the increase speed of the target cleaning files in the cleaning queue is greater than an expansion increase threshold value, judging that the cleaning condition of the target cleaning files in the cleaning queue meets an expansion condition;
and when the reduction speed of the target cleaning file in the cleaning queue is greater than a reduction threshold, judging that the cleaning condition of the target cleaning file in the cleaning queue meets a capacity reduction condition.
9. The file cleaning method according to claim 1, wherein the calling the plurality of second threads to delete the target cleaning file in the cleaning queue in sequence specifically comprises:
and pre-creating a thread pool, reading at least part of the target cleaning files in the cleaning queue into the thread pool, starting a plurality of second threads in the thread pool, and cleaning the target cleaning files in the thread pool through the second threads respectively.
10. A file cleaning system is characterized by comprising N servers, wherein N is a natural number greater than or equal to 2;
the server comprises at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the file cleaning method of any one of claims 1 to 9.
CN201810776134.0A 2018-07-16 2018-07-16 File cleaning method and system Active CN110727389B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810776134.0A CN110727389B (en) 2018-07-16 2018-07-16 File cleaning method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810776134.0A CN110727389B (en) 2018-07-16 2018-07-16 File cleaning method and system

Publications (2)

Publication Number Publication Date
CN110727389A true CN110727389A (en) 2020-01-24
CN110727389B CN110727389B (en) 2023-10-20

Family

ID=69217252

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810776134.0A Active CN110727389B (en) 2018-07-16 2018-07-16 File cleaning method and system

Country Status (1)

Country Link
CN (1) CN110727389B (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7526620B1 (en) * 2004-12-14 2009-04-28 Netapp, Inc. Disk sanitization in an active file system
US20090055399A1 (en) * 2007-08-21 2009-02-26 Qichu Lu Systems and methods for reading objects in a file system
CN105653635A (en) * 2015-12-25 2016-06-08 北京奇虎科技有限公司 Database management method and apparatus
CN106446155A (en) * 2016-09-22 2017-02-22 北京百度网讯科技有限公司 Method and device for cleansingdata in cloud storage system
CN106775990A (en) * 2016-12-31 2017-05-31 中国移动通信集团江苏有限公司 Request scheduling method and device
CN107066604A (en) * 2017-04-25 2017-08-18 努比亚技术有限公司 A kind of cleaning garbage files method and terminal

Also Published As

Publication number Publication date
CN110727389B (en) 2023-10-20

Similar Documents

Publication Publication Date Title
CN109783218B (en) Kubernetes container cluster-based time-associated container scheduling method
US8818989B2 (en) Memory usage query governor
CA2785398C (en) Managing queries
US6654766B1 (en) System and method for caching sets of objects
US8583608B2 (en) Maximum allowable runtime query governor
CN109271435B (en) Data extraction method and system supporting breakpoint continuous transmission
US20150295970A1 (en) Method and device for augmenting and releasing capacity of computing resources in real-time stream computing system
US8909614B2 (en) Data access location selecting system, method, and program
US8966493B1 (en) Managing execution of multiple requests in a job using overall deadline for the job
US20200311026A1 (en) File processing method and server
CN111522636A (en) Application container adjusting method, application container adjusting system, computer readable medium and terminal device
CN107818012B (en) Data processing method and device and electronic equipment
US20160291672A1 (en) Preformance state aware thread scheduling
CN112363812B (en) Database connection queue management method based on task classification and storage medium
CN108446169B (en) Job scheduling method and device
CN110727389A (en) File cleaning method and system
JP5692355B2 (en) Computer system, control system, control method and control program
CN110955502B (en) Task scheduling method and device
CN109582460B (en) Redis memory data elimination method and device
CN114374652B (en) Data transmission speed limiting method and device between thermomagnetic storage and blue light storage
Harrison et al. Energy--performance trade-offs via the ep queue
CN111176848B (en) Cluster task processing method, device, equipment and storage medium
CN111506256B (en) Method for reducing write performance variation and preventing IO blocking
CN114258532A (en) Apparatus and method for merging backup policies
US20220092070A1 (en) Optimal query scheduling for resource utilization optimization

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant