CN112000626A

CN112000626A - File processing method and system for file server

Info

Publication number: CN112000626A
Application number: CN202010819775.7A
Authority: CN
Inventors: 牟洪洋; 张芳
Original assignee: Suzhou Inspur Intelligent Technology Co Ltd
Current assignee: Suzhou Inspur Intelligent Technology Co Ltd
Priority date: 2020-08-14
Filing date: 2020-08-14
Publication date: 2020-11-27
Anticipated expiration: 2040-08-14
Also published as: CN112000626B

Abstract

The invention relates to the technical field of servers and provides a file processing method and system for a file server. The method comprises: scanning the files of a file server cluster at regular intervals and, according to the scan results, scoring the files and the servers respectively to obtain file score values and a server pressure value; and, based on the file score values and the server pressure value, processing abnormal files on the file server through human-computer interaction, where the abnormal files include residual junk files and large files that are used infrequently. Junk files that are rarely used and occupy a large amount of space are thereby cleaned from the file server, which improves the utilization efficiency of the file server and reduces the cost of purchasing file servers.

Description

File processing method and system for file server

Technical Field

The invention belongs to the technical field of servers, and particularly relates to a file server file processing method and system.

Background

A file server is an operating environment used to store files, such as software, documents, pictures, audio, and video, generated by the operation of various service systems. A file server is a standard component of service operation, and companies that operate various services each have their own file server. Using a file server speeds up each service system, improves the reusability of files, and increases file utilization.

However, file servers also cause certain problems. At present, a company's file servers usually do not support only one service; instead, a file server cluster generally serves multiple services at the same time. As a result, when a service goes offline, not all of its files are found when the files are deleted, so some files are left behind. Alternatively, a file upload reports an error and, although the service system resolves the problem, the previously uploaded file is not found, which also leaves a residual file.

Disclosure of Invention

Aiming at the defects in the prior art, the invention provides a file server file processing method which can determine whether to automatically delete junk files according to the runtime environment of a file server and the use frequency of files.

The technical scheme provided by the invention is as follows: a file server file processing method, the method comprising the steps of:

scanning files of the file server cluster at regular time, and respectively scoring the files and the servers according to the scanning result to obtain a file score value and a server pressure value;

and processing abnormal files of the file server in a man-machine interaction mode by combining the obtained file score value and the server pressure value, wherein the abnormal files comprise garbage residual files and large files with low use frequency.

As an improved scheme, the step of scanning the files of the file server cluster at regular time, and scoring the files and the servers respectively according to the scanning result to obtain the file score value and the server pressure value specifically includes the following steps:

receiving root directory information and scanning action setting information input by a user, wherein the scanning action setting information comprises scanning time and a scanning period;

scanning all files of a file server according to the received root directory information and scanning action setting information to obtain basic file information and basic server information, wherein the basic file information comprises the name, the size and the modification time of the file and the file server where the file is located, and the basic server information comprises the storage utilization rate and the storage growth rate of the server and the state information of a memory and a CPU (Central processing Unit);

according to the obtained basic information of the files, scoring is carried out on all the files of the file server, and the file score value of each file is obtained;

and scoring the file server according to the acquired basic server information to acquire a server pressure value.
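The scanning steps above can be pictured with a minimal Python sketch. It walks a root directory and records each file's name, size, and modification time. The `FileInfo` record, the `scan_files` function, and the `server` label are hypothetical names chosen for illustration; they do not come from the invention.

```python
import os
from dataclasses import dataclass
from typing import List


@dataclass
class FileInfo:
    """Basic file information gathered by one scan (hypothetical record)."""
    name: str           # full path of the file
    size_bytes: int     # file size in bytes
    modified_at: float  # modification time, seconds since the epoch
    server: str         # file server on which the file is located


def scan_files(root_dir: str, server: str) -> List[FileInfo]:
    """Walk root_dir and collect basic information for every file found."""
    results: List[FileInfo] = []
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            try:
                stat = os.stat(path)
            except OSError:
                # The file vanished or is unreadable; this sketch skips it.
                continue
            results.append(FileInfo(path, stat.st_size, stat.st_mtime, server))
    return results
```

Server-side data such as storage utilization, memory, and CPU state would be collected separately; this sketch covers only the per-file part.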

As an improved scheme, a calculation formula for scoring all files of the file server is as follows:

G＝e^-(x-10)+y+z

wherein G is the file score value; x is the total number of references to the file and appears only in the exponent -(x-10); y is the time difference, in days, between the server time during scanning and the file modification time; and z is the file size, in gigabytes (G).
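As a hedged illustration, the file score formula can be written as a small Python function. The function name `file_score` and its parameter names are assumptions made for readability; only the formula itself comes from the text.

```python
import math


def file_score(references: int, days_since_modified: float, size_gb: float) -> float:
    """Return G = e^(-(x - 10)) + y + z with x, y, z as defined in the text."""
    return math.exp(-(references - 10)) + days_since_modified + size_gb


# Two files with the same age and size but different reference counts.
rarely_used = file_score(references=2, days_since_modified=30, size_gb=1.5)
often_used = file_score(references=20, days_since_modified=30, size_gb=1.5)
```

Because the exponential term dominates below 10 references and vanishes above it, `rarely_used` is far larger than `often_used`, so the rarely referenced file ranks higher as a deletion candidate.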

As an improved scheme, a calculation formula for scoring the file server is as follows:

V＝40*x₁+40*x₂+10*x₃+10*x₄

wherein V is the server pressure value, x₁ is the storage utilization of the server, x₂ is the ratio of the storage growth to the total capacity, x₃ is the average CPU utilization during the period between the last scan and the current scan, and x₄ is the average memory utilization during the same period.
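The pressure formula can be sketched in Python as follows. The function and parameter names are hypothetical. The range check relies on the statement in the detailed description that each variable lies between 0 and 1.

```python
def server_pressure(storage_usage: float, storage_growth_ratio: float,
                    avg_cpu: float, avg_memory: float) -> float:
    """Return V = 40*x1 + 40*x2 + 10*x3 + 10*x4 for inputs in the range 0 to 1."""
    inputs = (storage_usage, storage_growth_ratio, avg_cpu, avg_memory)
    for value in inputs:
        if not 0.0 <= value <= 1.0:
            raise ValueError("each input must lie between 0 and 1")
    return (40 * storage_usage + 40 * storage_growth_ratio
            + 10 * avg_cpu + 10 * avg_memory)


# Example: 80% storage used, 10% growth, 50% average CPU and memory load.
pressure = server_pressure(0.8, 0.1, 0.5, 0.5)
```

With these weights the result lies between 0 and 100, and the two storage terms together account for 80 of those points.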

As an improved scheme, the step of processing the abnormal file of the file server in a man-machine interaction manner by combining the obtained file score value and the server pressure value specifically includes the following steps:

comparing the obtained server pressure value with a preset pressure value controllable range area, and judging whether the server pressure value is in the controllable range area;

when the server pressure value is judged to be in the pressure value controllable range area, scanning files of the file server cluster at regular time, and respectively scoring the files and the servers according to the scanning result to obtain a file score value and a server pressure value;

when the server pressure value is judged to exceed the pressure value controllable range area, judging whether the server pressure value is in the pressure value severe range area;

when the server pressure value is judged to be in the severe range of the pressure value, all files are sorted according to the score value, the sorted file information and the corresponding files are fed back to the user, and the file deleting operation action executed by the user is received;

and when the server pressure value is judged to exceed the pressure value severe range area, performing file transfer-based high-availability processing on the file servers of the server cluster.
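The three-way decision above can be sketched as a small Python function. The threshold values and the names `Action` and `choose_action` are hypothetical, and the sketch treats each range area as an upper bound on the pressure value, which is an assumption; the text leaves the exact ranges to configuration.

```python
from enum import Enum


class Action(Enum):
    KEEP_SCANNING = "continue the scheduled scan"
    ASK_USER_TO_DELETE = "present ranked files to the user for deletion"
    TRANSFER_FILES = "file-transfer-based high-availability processing"


def choose_action(pressure: float,
                  controllable_max: float = 60.0,  # hypothetical threshold
                  severe_max: float = 80.0) -> Action:  # hypothetical threshold
    """Map a server pressure value to one of the three handling paths."""
    if pressure <= controllable_max:
        return Action.KEEP_SCANNING
    if pressure <= severe_max:
        return Action.ASK_USER_TO_DELETE
    return Action.TRANSFER_FILES
```

For example, with these illustrative thresholds, `choose_action(46.0)` stays on the scanning path, while a value above `severe_max` selects the file-transfer path.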

As an improved solution, when it is determined that the server pressure value exceeds the pressure value severe range area, the step of performing file transfer-based high-availability processing on the file servers of the server cluster specifically includes the following steps:

when the server pressure value exceeds the severe range area of the pressure value, sorting all files according to score values, feeding back the sorted file information and the corresponding files to a user, and receiving a file deleting operation action executed by the user;

while feeding back the sorted file information and the corresponding files to the user, analyzing the available space of the file servers in the server cluster to obtain the remaining available space of each file server;

selecting N file servers from the plurality of file servers, and simultaneously selecting the M files with the highest scores, wherein N and M are natural numbers, and M is greater than or equal to N;

dividing the M files equally into N parts, distributing the parts evenly among the N file servers, and deleting the files other than the equally divided files from the N file servers;

sorting the files of the N file servers according to the size of the scores, feeding the files back to the user, and receiving a file deleting operation action executed by the user;

and after the user deletes the file, synchronizing the deletion action to the rest of other file servers.
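The selection and distribution steps can be illustrated with the following sketch. It picks the M highest-scoring files and assigns them to N servers in round-robin order, so the parts differ in size by at most one file. Round-robin is an assumption about how "equally dividing" might be realized, and `plan_transfer` is a hypothetical name.

```python
import heapq
from typing import Dict, List, Sequence, Tuple


def plan_transfer(scored_files: Sequence[Tuple[str, float]],
                  servers: Sequence[str], m: int) -> Dict[str, List[str]]:
    """Assign the m highest-scoring files to the given servers as evenly as possible."""
    if not servers:
        raise ValueError("at least one target server is required")
    if m < len(servers):
        raise ValueError("M must be greater than or equal to N")
    # Top-m files by score, highest first.
    top = heapq.nlargest(m, scored_files, key=lambda item: item[1])
    plan: Dict[str, List[str]] = {server: [] for server in servers}
    for index, (name, _score) in enumerate(top):
        plan[servers[index % len(servers)]].append(name)
    return plan


scored = [("a", 5.0), ("b", 9.0), ("c", 1.0), ("d", 7.0)]
plan = plan_transfer(scored, ["s1", "s2"], m=3)
```

The sketch only produces a plan; moving the files and removing the other files from the N servers are left out.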

It is another object of the present invention to provide a file server file processing system, the system comprising:

the scanning and scoring module is used for scanning the files of the file server cluster at regular time, and scoring the files and the servers respectively according to the scanning result to obtain a file score value and a server pressure value;

and the file processing module is used for processing abnormal files of the file server in a man-machine interaction mode by combining the obtained file score value and the server pressure value, wherein the abnormal files comprise garbage residual files and large files with low use frequency.

As an improved scheme, the scanning and scoring module specifically includes:

the information receiving module is used for receiving root directory information and scanning action setting information input by a user, wherein the scanning action setting information comprises scanning time and a scanning period;

the scanning module is used for scanning all files of the file server according to the received root directory information and scanning action setting information to acquire basic file information and basic server information, wherein the basic file information comprises the name, the size and the modification time of the file and the file server where the file is located, and the basic server information comprises the storage utilization rate and the storage growth rate of the server and the state information of the memory and the CPU (central processing unit);

the file scoring module is used for scoring all files of the file server according to the acquired basic information of the files to acquire a file score value of each file;

and the server scoring module is used for scoring the file server according to the acquired basic server information to acquire a server pressure value.

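As an improved scheme, the calculation formula for scoring all files of the file server is as follows:

G＝e^-(x-10)+y+z

wherein G is the file score value, x is the total number of references to the file, y is the time difference, in days, between the server time during scanning and the file modification time, and z is the size of the file, in gigabytes (G);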

the calculation formula for scoring the file server is as follows:

V＝40*x₁+40*x₂+10*x₃+10*x₄

As an improved scheme, the file processing module specifically includes:

the first comparison and judgment module is used for comparing the obtained server pressure value with a preset controllable range area of the pressure value and judging whether the server pressure value is in the controllable range area or not;

the return execution control module is used for scanning the files of the file server cluster at regular time and respectively scoring the files and the servers according to the scanning result to obtain a file score value and a server pressure value when the server pressure value is judged to be in the controllable range area of the pressure value;

the second comparison and judgment module is used for judging whether the server pressure value is in the pressure value severe range area when the server pressure value is judged to exceed the pressure value controllable range area;

the first processing module is used for sorting all files according to score values when the server pressure value is judged to be in the severe pressure value range area, feeding back the sorted file information and the corresponding files to a user, and receiving a file deleting operation action executed by the user;

the second processing module is used for executing high-availability processing based on file transfer on the file servers of the server cluster when the server pressure value is judged to exceed the pressure value severe range area;

wherein, the second processing module specifically comprises:

the third processing module is used for sorting all files according to score values when the server pressure value is judged to exceed the severe range area of the pressure value, feeding back the sorted file information and the corresponding files to the user, and receiving a file deleting operation action executed by the user;

the available space acquisition module is used for analyzing the available space of the file servers in the server cluster while feeding back the sorted file information and the corresponding files to the user, and acquiring the residual available space of each file server;

the selecting module is used for selecting N file servers from the plurality of file servers and simultaneously selecting the M files with the highest scores, wherein N and M are natural numbers, and M is greater than or equal to N;

the fourth processing module is used for dividing the M files equally into N parts, distributing the parts evenly among the N file servers, and deleting the files other than the equally divided files from the N file servers;

the fifth processing module is used for sorting the files of the N file servers according to the scores, feeding the files back to the user and receiving a file deleting operation action executed by the user;

and the synchronization module is used for synchronizing the deletion action to the rest of other file servers after the user deletes the file.

In the embodiment of the invention, the files of the file server cluster are scanned at regular intervals, and the files and the servers are scored respectively according to the scan results to obtain file score values and a server pressure value. Based on the file score values and the server pressure value, abnormal files on the file server, including residual junk files and large files that are used infrequently, are processed through human-computer interaction. Junk files that are rarely used and occupy a large amount of space are thereby cleaned from the file server, which improves the utilization efficiency of the file server and reduces the cost of purchasing file servers.

Drawings

In order to more clearly illustrate the detailed description of the invention or the technical solutions in the prior art, the drawings that are needed in the detailed description of the invention or the prior art will be briefly described below. Throughout the drawings, like elements or portions are generally identified by like reference numerals. In the drawings, elements or portions are not necessarily drawn to scale.

FIG. 1 is a flow chart of an implementation of a file server file processing method provided by the present invention;

FIG. 2 is a flowchart illustrating an implementation of scanning files of a file server cluster at regular time, and scoring the files and the servers respectively according to the scanning result to obtain a file score value and a server pressure value;

FIG. 3 is a flowchart illustrating an implementation of processing an abnormal file of a file server in a human-computer interaction manner according to the file score value and the server pressure value obtained in the foregoing manner;

FIG. 4 is a flow chart illustrating an implementation of a file transfer-based high availability process for file servers of a server cluster when it is determined that the server pressure value exceeds the pressure value severe range area;

FIG. 5 is a block diagram of a file server file processing system provided by the present invention;

FIG. 6 is a block diagram of a scanning scoring module provided by the present invention;

FIG. 7 is a block diagram of a file processing module provided by the present invention;

FIG. 8 is a block diagram of a second processing module provided in the present invention.

Detailed Description

Embodiments of the present invention will be described in detail below with reference to the accompanying drawings. The following examples are merely for illustrating the technical solutions of the present invention more clearly, and therefore are only examples, and the protection scope of the present invention is not limited thereby.

Fig. 1 is a flowchart of an implementation of a file processing method of a file server provided by the present invention, which specifically includes the following steps:

in step S101, scanning files of a file server cluster at regular time, and scoring the files and the servers respectively according to the scanning result to obtain a file score value and a server pressure value;

in step S102, the obtained file score value and the server pressure value are combined, and an abnormal file of the file server is processed in a human-computer interaction manner, where the abnormal file includes a garbage residual file and a large file with a low frequency of use.

In the embodiment of the present invention, as shown in fig. 2, the step of scanning the files of the file server cluster at regular time, and scoring the files and the servers respectively according to the scanning result to obtain the file score value and the server pressure value specifically includes the following steps:

in step S201, receiving root directory information and scanning action setting information input by a user, wherein the scanning action setting information includes a scanning time and a scanning period;

in this step, the user first enters the root directory on the interface and sets the automatic scan time and scan period, and may also set the threshold ranges; default values are used for any setting that is not provided.
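A minimal sketch of such user settings with fallback defaults is shown below. The class name `ScanSettings` and the default values (a daily scan at 02:00) are assumptions for illustration; the text does not state the actual defaults.

```python
from dataclasses import dataclass
from datetime import time, timedelta


@dataclass(frozen=True)
class ScanSettings:
    """Scan configuration entered by the user (hypothetical structure)."""
    root_dir: str                               # must be supplied by the user
    scan_time: time = time(2, 0)                # assumed default: 02:00
    scan_period: timedelta = timedelta(days=1)  # assumed default: once a day


# The user supplies only the root directory, so the defaults apply.
defaults_used = ScanSettings(root_dir="/data/files")

# The user overrides the scan period with a weekly scan.
weekly = ScanSettings(root_dir="/data/files", scan_period=timedelta(weeks=1))
```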

In step S202, scanning all files of the file server according to the received root directory information and the scanning action setting information, and acquiring file basic information and server basic information, where the file basic information includes a name, a size, a modification time, and a file server where the file is located, and the server basic information includes a storage utilization rate, a storage growth rate, and state information of a memory and a CPU of the server;

in step S203, according to the acquired basic information of the file, scoring all files of the file server to acquire a file score value of each file;

in this step, the calculation formula for scoring all files of the file server is:

G＝e^-(x-10)+y+z

wherein G is the score value of the file, which currently has no upper limit; when files are finally recommended to the user, they are ranked by score from high to low. x is the total number of references to the file, counted from file creation, with each download, view, or similar access recorded as one reference. The time of the last reference to the file is checked first: if it is more than 3 years before the current time, the score of the file is set to the highest score among all current file scores. Empirically, a file that has been referenced more than 10 times is probably still in use, so the score falls rapidly once the number of references exceeds 10 and is high when the number of references is below 10. y is the time difference, in days, between the server time during scanning and the file modification time, and z is the size of the file, in gigabytes (G). The reference-count term is scaled very differently from the time and size terms because the number of references has the greatest influence on whether a file should be deleted. Meanwhile, the scanning module uploads performance data for all servers, such as CPU, memory, storage utilization, and storage growth rate.
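The ranking rule described in this step, including the three-year override, can be sketched as follows. The record type `FileStats`, the function `rank_files`, and the choice to count a year as 365 days are assumptions made for this illustration.

```python
import math
from typing import List, NamedTuple, Tuple

THREE_YEARS_IN_DAYS = 3 * 365  # assumption: a year is counted as 365 days


class FileStats(NamedTuple):
    name: str
    references: int                   # x: total number of references
    days_since_modified: float        # y: age of last modification, in days
    size_gb: float                    # z: file size, in gigabytes
    days_since_last_reference: float  # used only for the three-year rule


def rank_files(files: List[FileStats]) -> List[Tuple[str, float]]:
    """Score every file, apply the three-year override, and sort high to low."""
    scores = {
        f.name: math.exp(-(f.references - 10)) + f.days_since_modified + f.size_gb
        for f in files
    }
    highest = max(scores.values(), default=0.0)
    for f in files:
        # A file not referenced for over three years gets the current top score.
        if f.days_since_last_reference > THREE_YEARS_IN_DAYS:
            scores[f.name] = highest
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranked = rank_files([
    FileStats("old.iso", 50, 10, 1.0, 1200),
    FileStats("tmp.log", 1, 5, 0.1, 3),
    FileStats("live.db", 40, 1, 2.0, 0),
])
```

In this example `old.iso` has many references but has not been touched for more than three years, so it is lifted to the same score as `tmp.log`, and both rank ahead of `live.db`.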

In step S204, the file server is scored according to the acquired server basic information, so as to obtain a server pressure value.

In this step, the calculation formula for scoring the file server is as follows:

V＝40*x₁+40*x₂+10*x₃+10*x₄

wherein V is the server pressure value, x₁ is the storage utilization of the server, x₂ is the ratio of the storage growth to the total capacity, x₃ is the average CPU utilization during the period between the last scan and the current scan, and x₄ is the average memory utilization during the same period. Each variable ranges from 0 to 1, and the coefficient of each term is derived empirically. Because a file server is primarily concerned with its storage capacity, the storage utilization and the storage growth rate carry the highest weights. The CPU and memory are necessary reference data when the file server processes files, and excessive CPU or memory pressure also degrades the performance of the file service. After the pressure value is calculated, the data are processed according to certain rules, thereby optimizing the file server.
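The inputs x₂ and x₃ (or x₄) have to be derived from data gathered across two scans. The following sketch shows one way this might be done; the function names, the treatment of shrinking storage as zero growth, and the sample figures are all assumptions.

```python
from typing import Sequence


def storage_growth_ratio(used_last_gb: float, used_now_gb: float,
                         capacity_gb: float) -> float:
    """Ratio of storage growth since the last scan to total capacity, in 0 to 1."""
    if capacity_gb <= 0:
        raise ValueError("capacity must be positive")
    # Assumption: if used space shrank, growth is counted as zero.
    growth = max(used_now_gb - used_last_gb, 0.0)
    return min(growth / capacity_gb, 1.0)


def average_utilization(samples: Sequence[float]) -> float:
    """Mean of utilization samples (each in 0 to 1) taken between two scans."""
    if not samples:
        raise ValueError("at least one sample is required")
    return sum(samples) / len(samples)


# Hypothetical figures: 700 GB used at the last scan, 750 GB now, 1000 GB total.
x2 = storage_growth_ratio(used_last_gb=700.0, used_now_gb=750.0, capacity_gb=1000.0)
x3 = average_utilization([0.20, 0.40, 0.30])
```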

In the embodiment of the present invention, as shown in fig. 3, the step of processing the abnormal file of the file server in a human-computer interaction manner by combining the obtained file score value and the server pressure value specifically includes the following steps:

in step S301, comparing the obtained server pressure value with a preset pressure value controllable range area, and determining whether the server pressure value is within the controllable range area, if so, executing step S302, otherwise, executing step S303;

in step S302, when it is determined that the server pressure value is within the controllable range of the pressure value, scanning files of the file server cluster at the fixed time, and scoring the files and the servers according to the scanning result to obtain a file score value and a server pressure value;

in step S303, when it is determined that the server pressure value exceeds the controllable range of pressure values, determining whether the server pressure value is within a severe range of pressure values, if so, performing step S304, otherwise, performing step S305;

in step S304, when it is determined that the server pressure value is within the severe pressure value range area, sorting all files according to score values, feeding back the sorted file information and corresponding files to the user, and receiving a file deletion operation performed by the user;

in step S305, when it is determined that the server pressure value exceeds the pressure value severe range area, a file transfer-based high availability process is performed on the file servers of the server cluster.

In this embodiment, the following steps need to be executed before the above step S301 is executed:

a pressure value severe range area and a pressure value controllable range area are set according to the operating conditions of the file server; for example, the pressure value controllable range area corresponds to a storage space utilization of 60%-80%, and the pressure value severe range area corresponds to a storage space utilization of 80%-90%. Details are not repeated here.

In this embodiment of the present invention, as shown in fig. 4, when it is determined that the server pressure value exceeds the pressure value severe range area, the step of performing file transfer-based high availability processing on the file servers of the server cluster specifically includes the following steps:

in step S401, when it is determined that the server pressure value exceeds the pressure value severe range area, sorting all files according to score values, feeding back sorted file information and corresponding files to a user, and receiving a file deletion operation action performed by the user;

in step S402, while feeding back the sorted file information and the corresponding files to the user, analyzing the available space of the file servers in the server cluster, and obtaining the remaining available space of each file server;

in step S403, N of the plurality of file servers are selected, and M files with top scores are simultaneously selected, where N and M are natural numbers, and M is greater than or equal to N;

in step S404, dividing the M files equally into N parts, distributing the parts evenly among the N file servers, and deleting the files other than the equally divided files from the N file servers;

in step S405, sorting the files of the N file servers according to the size of the score, feeding the sorted files back to the user, and receiving a file deletion operation action performed by the user;

in step S406, when the user deletes the file, the deletion operation is synchronized to the remaining other file servers.
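The final synchronization step can be pictured with an in-memory model in which each server maps to the set of file names it holds. This is only a sketch: `sync_deletion` and the dictionary layout are hypothetical, and a real cluster would issue deletions over its own storage interface.

```python
from typing import Dict, List, Set


def sync_deletion(deleted_file: str, server_files: Dict[str, Set[str]]) -> List[str]:
    """Remove deleted_file from every server holding it; return the affected servers."""
    affected: List[str] = []
    for server, files in server_files.items():
        if deleted_file in files:
            files.remove(deleted_file)
            affected.append(server)
    return affected


cluster = {"s1": {"a", "b"}, "s2": {"b"}, "s3": {"c"}}
affected = sync_deletion("b", cluster)
```

After the call, no server in `cluster` still lists the deleted file, and `affected` names the servers that held a copy.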

The above is one specific embodiment; other approaches are of course possible. For example, after the remaining available space of the file servers is analyzed, the file server with the largest remaining available space is selected as the transfer server, the selected files are transferred to it, the deletion operation of the user is executed, and synchronization and real-time scanning are then performed. Details are not repeated here.
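Choosing the target in this alternative reduces to picking the server with the most free space, as the short sketch below shows. The function name and the sample figures are hypothetical.

```python
from typing import Dict


def pick_transfer_server(remaining_space_gb: Dict[str, float]) -> str:
    """Return the name of the server with the largest remaining available space."""
    if not remaining_space_gb:
        raise ValueError("no file servers to choose from")
    return max(remaining_space_gb, key=lambda name: remaining_space_gb[name])


target = pick_transfer_server({"s1": 120.0, "s2": 480.0, "s3": 75.5})
```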

In the embodiment of the invention, multiple users can be configured, and notifications can also be sent to users by short message (SMS) or e-mail, so that users can check the usage of the file servers at any time. Users can also actively query the usage of each file server, so as to track and process data at any time, and can process data by SMS or e-mail, which is convenient for users.

Fig. 5 is a block diagram illustrating a file server file processing system according to the present invention, and for convenience of explanation, only the parts related to the embodiment of the present invention are shown in the diagram.

The file server file processing system includes:

the scanning and scoring module 11 is configured to scan files of the file server cluster at regular time, and score the files and the servers respectively according to a scanning result to obtain a file score value and a server pressure value;

and the file processing module 12 is configured to process abnormal files of the file server in a human-computer interaction manner by combining the obtained file score value and the server pressure value, where the abnormal files include garbage residual files and large files with low use frequency.

In this embodiment, as shown in fig. 6, the scanning and scoring module 11 specifically includes:

the information receiving module 13 is configured to receive root directory information and scanning action setting information input by a user, where the scanning action setting information includes scanning time and scanning period;

the scanning module 14 is configured to scan all files of the file server according to the received root directory information and scanning action setting information, and obtain basic file information and basic server information, where the basic file information includes a name, a size, a modification time, and a file server where the file is located, and the basic server information includes a storage utilization rate, a storage growth rate, and state information of a memory and a CPU of the server;

the file scoring module 15 is configured to score all files of the file server according to the obtained basic information of the files, and obtain a file score value of each file;

and the server scoring module 16 is configured to score the file server according to the obtained server basic information, so as to obtain a server pressure value.

In the embodiment of the present invention, a calculation formula for scoring all files of the file server is as follows:

G＝e^-(x-10)+y+z

the calculation formula for scoring the file server is as follows:

V＝40*x₁+40*x₂+10*x₃+10*x₄

In this embodiment of the present invention, as shown in fig. 7, the file processing module 12 specifically includes:

the first comparison and judgment module 17 is configured to compare the obtained server pressure value with a preset controllable range area of the pressure value, and judge whether the server pressure value is in the controllable range area;

the return execution control module 18 is configured to, when it is determined that the server pressure value is within the controllable range of the pressure value, execute the step of scanning the files of the file server cluster at the fixed time, and score the files and the servers according to the scanning result to obtain a file score value and a server pressure value;

the second comparison and judgment module 19 is used for judging whether the server pressure value is in the pressure value severe range area when the server pressure value is judged to exceed the pressure value controllable range area;

the first processing module 20 is configured to sort all files according to the score values, feed back sorted file information and corresponding files to the user, and receive a file deletion operation action executed by the user when it is determined that the server pressure value is within the severe pressure value range area;

the second processing module 21 is configured to, when it is determined that the server pressure value exceeds the pressure value severe range area, perform file transfer-based high-availability processing on file servers of the server cluster;

as shown in fig. 8, the second processing module 21 specifically includes:

the third processing module 22 is configured to sort all files according to score values when it is determined that the server pressure value exceeds the severe range area of the pressure value, feed back sorted file information and corresponding files to the user, and receive a file deletion operation action executed by the user;

the available space obtaining module 23 is configured to, while feeding back the sorted file information and the corresponding files to the user, analyze available spaces of the file servers in the server cluster, and obtain remaining available spaces of the file servers;

a selecting module 24, configured to select N of the plurality of file servers, and simultaneously select M files with top scores, where N and M are natural numbers, and M is greater than or equal to N;

a fourth processing module 25, configured to equally divide the M files into N, equally allocate the M files to the N file servers, and delete the files other than the equally divided files from the N file servers;

a fifth processing module 26, configured to sort the files of the N file servers according to the scores, feed the sorted files back to the user, and receive a file deletion operation action performed by the user;

and the synchronization module 27 is configured to synchronize the deletion action to the remaining other file servers after the user deletes the file.

The implementation of each module is described in the above method embodiments, and is not described herein again.

The above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; such modifications and substitutions do not depart from the spirit and scope of the present invention, and they should be construed as being included in the following claims and description.

Claims

1. A file server file processing method, comprising the steps of:

2. The file server file processing method according to claim 1, wherein the step of scanning the files of the file server cluster at regular time and scoring the files and the servers respectively according to the scanning result to obtain a file score value and a server pressure value specifically comprises the steps of:

3. The file server file processing method according to claim 2, wherein a calculation formula for scoring all files of the file server is:

G＝e^-(x-10)+y+z

4. The file server file processing method of claim 2, wherein the formula for scoring the file server is:

V＝40*x₁+40*x₂+10*x₃+10*x₄

5. The file server file processing method according to claim 2, wherein the step of processing the abnormal file of the file server in a man-machine interaction manner by combining the obtained file score value and the server pressure value specifically comprises the steps of:

6. The file server file processing method according to claim 5, wherein the step of performing file transfer-based high availability processing on the file servers of the server cluster when it is determined that the server pressure value exceeds the pressure value severe range area specifically comprises the following steps:

7. A file server file processing system, the system comprising:

8. The file server file processing system of claim 7, wherein the scan scoring module specifically comprises:

9. The file server file processing system of claim 8, wherein the formula for scoring all files of the file server is:

G＝e^-(x-10)+y+z

the calculation formula for scoring the file server is as follows:

V＝40*x₁+40*x₂+10*x₃+10*x₄

10. The file server file processing method according to claim 2, wherein the file processing module specifically comprises:

wherein, the second processing module specifically comprises: