CN117331501A - Data analysis management method, equipment and system for solid state disk - Google Patents
- Publication number
- CN117331501A (CN202311275018.8A)
- Authority
- CN
- China
- Prior art keywords
- file
- target
- cleaning
- storage library
- files
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/0608—Saving storage space on storage systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/16—File or folder operations, e.g. details of user interfaces specifically adapted to file systems
- G06F16/162—Delete operations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0638—Organizing or formatting or addressing of data
- G06F3/0643—Management of files
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0646—Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
- G06F3/0652—Erasing, e.g. deleting, data cleaning, moving of data to a wastebasket
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/0671—In-line storage system
- G06F3/0673—Single storage device
- G06F3/0679—Non-volatile semiconductor memory device, e.g. flash memory, one time programmable memory [OTP]
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to the technical field of data analysis and management for solid state disks, and specifically discloses a data analysis management method, equipment and system for a solid state disk. The method comprises file classification matching, file repository cleaning demand analysis, file cleaning adaptation analysis, and processing of the files to be cleaned. Files are classified; a cleaning demand trend index is analyzed for each file repository by combining the byte counts of the files with the file growth trend; a cleaning adaptation index is analyzed for each target file; and the files to be cleaned are finally processed to release storage space on the solid state disk of the target computer. This addresses the limitations of current file data analysis and management and enables multi-dimensional analysis of the files to be cleaned, so that the storage space of the solid state disk can meet user needs while enough space remains for backup data, reducing the likelihood of data loss and of backup data being unrecoverable.
Description
Technical Field
The invention relates to the technical field of data analysis and management of solid state disks, in particular to a method, equipment and a system for data analysis and management of a solid state disk.
Background
A solid state disk is a hard disk built from an array of solid-state electronic memory chips and consists of a control unit, a storage unit and a cache unit. In a computer, a solid state disk can serve as a system disk to speed up system and application startup, or as a storage disk to provide faster data access. However, when the storage space of the solid state disk is insufficient, there is not enough room to store backup data, which may lead to data loss and failure to recover the backup data. The file data on the computer's solid state disk therefore needs to be analyzed and managed to release storage space, ensuring safe and stable operation of the solid state disk.
Existing analysis and management of file data on a computer's solid state disk mainly analyzes the byte count and access count of each file, and has the following problems:
1. Only individual files are analyzed. Files are not classified, and the cleaning demand trend of a file repository is not analyzed in combination with the file growth trend, so the growth of files in the repository cannot be displayed intuitively. This lowers the accuracy with which the repository's cleaning demand trend is determined and, in turn, weakens the space-release effect on the solid state disk of the target computer.
2. Only the access count and modification count of a file are considered, not how frequently the file is accessed or modified. The analysis is therefore not comprehensive enough: the cleaning suitability of each target file cannot be determined accurately, the cleaning adaptation index of each target file carries a large error, and the reliability of the cleaning adaptation analysis cannot be improved.
Disclosure of Invention
Aiming at the problems, the invention aims to provide a method, equipment and a system for data analysis management of a solid state disk, which effectively solve the problems mentioned in the background art.
The technical solution adopted by the invention is as follows. In a first aspect, the invention provides a data analysis management method for a solid state disk, comprising the following steps:
S1. File classification matching: extract the file name of each file in the target computer, construct the file repositories, and classify and match the files by file name so that each file is stored in its corresponding file repository.
S2. File repository cleaning demand analysis: extract the byte capacity of the solid state disk in the target computer, the number of stored files and the number of accesses of each file repository on each monitoring day, and the byte count of each file in each file repository; analyze the cleaning demand trend index of each file repository; when the cleaning demand trend index of a file repository is greater than or equal to a set value, mark that repository as a target cleaning repository and execute step S3.
S3. File cleaning adaptation analysis: mark each file in each target cleaning repository as a target file, extract the accumulated storage duration and operation information of each target file, and analyze the cleaning adaptation index of each target file; when the cleaning adaptation index of a target file is greater than or equal to a set value, the target file is a file to be cleaned, and step S4 is executed.
S4. Processing of the files to be cleaned: delete each file to be cleaned in each target cleaning repository.
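As an illustrative sketch only, the two threshold decisions in steps S2 and S3 could be expressed in Python as follows. The function name, the dictionary layout and the threshold values are assumptions introduced here, and the index values are taken as already computed; the sketch only selects candidates and does not delete anything.

```python
def select_files_to_clean(repo_index, file_index, repo_threshold, file_threshold):
    """Pick files to clean using the two thresholds of steps S2 and S3.

    repo_index: {repository name: cleaning demand trend index}
    file_index: {repository name: {file name: cleaning adaptation index}}
    Both layouts are hypothetical; the source does not prescribe a data model.
    """
    # S2: a repository becomes a target cleaning repository when its
    # index is greater than or equal to the set value.
    targets = [repo for repo, chi in repo_index.items() if chi >= repo_threshold]

    # S3: within each target repository, a file is a file to be cleaned when
    # its adaptation index is greater than or equal to the set value.
    to_clean = {}
    for repo in targets:
        files = file_index.get(repo, {})
        to_clean[repo] = [name for name, idx in files.items() if idx >= file_threshold]
    return to_clean


result = select_files_to_clean(
    repo_index={"logs": 0.9, "docs": 0.3},
    file_index={"logs": {"a.log": 0.8, "b.log": 0.2}, "docs": {"c.txt": 0.95}},
    repo_threshold=0.5,
    file_threshold=0.6,
)
print(result)
```

Note that `c.txt` is not selected even though its own index is high, because its repository was not marked as a target cleaning repository in S2.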
Specifically, each file is stored in its corresponding file repository as follows:
A1. Extract text information from the file name of each file.
A2. Segment the extracted text of each file into words to obtain the phrases of that file.
A3. Match the phrases of each file against the keyword library associated with each file repository, as stored in the cloud database. If a phrase of a file is found in a keyword library, the file repository associated with that keyword library is taken as the file's repository, and the file is stored there.
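A minimal Python sketch of steps A1 to A3 is given below under stated assumptions. The keyword libraries, the repository names and the tokenization rule (splitting on non-alphanumeric characters) are hypothetical; the source does not specify a word segmentation algorithm, and segmenting Chinese file names would need a dedicated segmenter.

```python
import re

# Hypothetical keyword libraries, one per file repository. In the source
# these are stored in a cloud database; here they are inlined for brevity.
KEYWORD_LIBRARIES = {
    "reports": {"report", "summary", "invoice"},
    "media": {"photo", "video", "clip"},
}


def tokenize(file_name: str) -> list[str]:
    """A1/A2: drop the extension and split the name into lowercase tokens."""
    stem = file_name.rsplit(".", 1)[0]
    return [tok.lower() for tok in re.split(r"[^A-Za-z0-9]+", stem) if tok]


def match_repository(file_name: str, libraries=KEYWORD_LIBRARIES):
    """A3: return the first repository whose keyword library holds a token."""
    tokens = tokenize(file_name)
    for repo, keywords in libraries.items():
        if any(tok in keywords for tok in tokens):
            return repo
    return None  # no keyword library matched this file


print(match_repository("Q3_sales-Report.pdf"))
print(match_repository("holiday_video.mp4"))
```

Files that match no keyword library return `None` in this sketch; how such files are handled is not stated in the source.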
Specifically, the cleaning demand trend index of each file repository is analyzed as follows:
B1. From the number of stored files of each file repository on each monitoring day, calculate the file growth trend evaluation index $\beta_i$ of each file repository, where $i$ is the index of the file repository, $i = 1, 2, \ldots, n$.
B2. Sum the accesses of each file repository over the monitoring days to obtain the total number of accesses of each file repository, denoted $\eta_i$.
B3. Sum the byte counts of the files in each file repository to obtain the combined file byte count of each file repository, denoted $\varepsilon_i$.
B4. Denote the byte capacity of the solid state disk in the target computer as $\varepsilon_{\text{total}}$.
B5. Calculate the cleaning demand trend index $\chi_i$ of each file repository [formula not reproduced in the source text], where $\beta'$, $\tau_1$ and $\eta'$ denote the set reference values of the file growth trend evaluation index, the combined file byte count ratio and the total number of accesses, respectively; $a_1$, $a_2$ and $a_3$ denote the set weights of the file growth trend evaluation index, the combined file byte count ratio and the total number of accesses in the evaluation of the cleaning demand trend index; and $e$ is the natural constant.
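The aggregate quantities in steps B2 to B4 can be sketched in Python as follows. The function and parameter names are assumptions, and the byte count ratio is taken here to be the combined file byte count divided by the byte capacity of the disk, which is an interpretation of the text rather than a formula stated in it. The final index of step B5 is not computed because its formula is not available.

```python
def repository_totals(daily_accesses, file_sizes, disk_capacity_bytes):
    """Return (total accesses, combined byte count, byte count ratio).

    daily_accesses: accesses of one repository on each monitoring day
    file_sizes: byte count of each file in that repository
    disk_capacity_bytes: byte capacity of the solid state disk
    """
    if disk_capacity_bytes <= 0:
        raise ValueError("disk capacity must be positive")
    total_accesses = sum(daily_accesses)             # eta_i (B2)
    total_bytes = sum(file_sizes)                    # epsilon_i (B3)
    # Assumed definition of the ratio: epsilon_i / epsilon_total (B4).
    byte_ratio = total_bytes / disk_capacity_bytes
    return total_accesses, total_bytes, byte_ratio


totals = repository_totals([3, 5, 2], [100, 300], 1000)
print(totals)
```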
Specifically, the file growth trend evaluation index of each file repository is calculated as follows:
C1. Construct a file growth curve for each file repository with the monitoring day as the abscissa and the number of stored files as the ordinate, read the slope from the curve, and take it as the file growth rate of the repository, denoted $K_i$.
C2. Set the file growth rate correction factor $\lambda_i$ of each file repository.
C3. Calculate the file growth trend evaluation index $\beta_i$ of each file repository [formula not reproduced in the source text], where $K_i'$ denotes the set reference file growth rate of the $i$-th file repository.
Specifically, the file growth rate correction factor of each file repository is set as follows:
D1. Taking the starting point of each file repository's file growth curve as the base point and the set reference file growth rate as the slope, construct a reference line on the file growth curve. Count the monitoring days on which the curve lies below the reference line, and take this count as the deviation number, denoted $M_i$.
D2. Read the amplitude of each file repository's file growth curve, denoted $H_i$.
D3. Set the file growth rate correction factor $\lambda_i$ of each file repository [formula not reproduced in the source text], where $M'$ and $H'$ denote the set reference values of the deviation number and the file growth curve amplitude, respectively, and $a_4$ and $a_5$ denote the set weights of the deviation number and the file growth curve amplitude deviation in the evaluation of the file growth rate correction factor.
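The curve-derived quantities of steps C1, D1 and D2 can be sketched in Python as follows. Several choices here are assumptions not stated in the source: the slope is taken as the least-squares slope over all monitoring days, the amplitude as the difference between the largest and smallest daily file counts, and monitoring days are numbered from 0. The correction factor itself is not computed because its formula is not available.

```python
def growth_curve_stats(daily_counts, reference_rate):
    """Return (growth rate K, deviation number M, amplitude H) for one repository.

    daily_counts: number of stored files on each monitoring day, in order
    reference_rate: set reference file growth rate (files per day)
    """
    n = len(daily_counts)
    if n < 2:
        raise ValueError("need at least two monitoring days")
    days = range(n)

    # C1: least-squares slope of file count against monitoring day (assumed).
    mean_x = (n - 1) / 2
    mean_y = sum(daily_counts) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, daily_counts))
    slope = sxy / sxx

    # D1: reference line through the curve's starting point with the
    # reference rate as slope; count the days that fall below it.
    base = daily_counts[0]
    deviation = sum(
        1 for x, y in zip(days, daily_counts) if y < base + reference_rate * x
    )

    # D2: amplitude taken as max minus min of the curve (assumed).
    amplitude = max(daily_counts) - min(daily_counts)
    return slope, deviation, amplitude


print(growth_curve_stats([10, 10, 11, 20], reference_rate=2))
```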
Specifically, the operation information includes a time point corresponding to each access and a time point corresponding to each modification.
Specifically, the cleaning adaptation index of each target file is analyzed as follows:
E1. Extract from the operation information the time point of each access and the time point of each modification of each target file.
E2. From the time points of the accesses of each target file, calculate the access frequency index $\delta_j$ of each target file, where $j$ is the index of the target file, $j = 1, 2, \ldots, m$.
E3. Calculate the modification frequency index $\omega_j$ of each target file in the same way as the access frequency index.
E4. Record the accumulated storage duration of each target file [symbol not reproduced in the source text].
E5. Calculate the cleaning adaptation index of each target file [symbol and formula not reproduced in the source text], where $\delta'$, $\omega'$ and $T_{\text{store}}$ denote the set reference values of the access frequency index, the modification frequency index and the accumulated storage duration, respectively, and $a_6$, $a_7$ and $a_8$ denote the set weights of the access frequency index, the modification frequency index and the accumulated storage duration in the evaluation of the cleaning adaptation index.
Specifically, the access frequency index of each target file is calculated as follows:
F1. Compare the time points of successive accesses of each target file to obtain the time interval of each access.
F2. Compare the time interval of each access of each target file with the set reference access time interval. If the time interval of an access is smaller than the set reference access time interval, mark that access as a target access. Count the target accesses of each target file, denoted $\rho_j$.
F3. Extract the minimum of the access time intervals of each target file, denoted $T_j$.
F4. Calculate the access frequency index $\delta_j$ of each target file [formula not reproduced in the source text], where $\rho'$ and $T'$ denote the set reference values of the number of target accesses and the access time interval, respectively, and $b_1$ and $b_2$ denote the set weights of the number of target accesses and the access time interval in the evaluation of the access frequency index.
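Steps F1 to F3 can be sketched in Python as follows. Timestamps are assumed to be numeric (for example, seconds since some epoch), and the function name and return layout are hypothetical. The index of step F4 is not computed because its formula is not available. Since the modification frequency index is described as being calculated in the same way, the same helper could be applied to modification time points.

```python
def interval_stats(timestamps, reference_interval):
    """Return (intervals, target count, minimum interval) for one file.

    timestamps: time points of each access (or modification), any order
    reference_interval: set reference time interval, same unit as timestamps
    """
    ordered = sorted(timestamps)
    # F1: intervals between consecutive events.
    intervals = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    # F2: an event counts as a target access when its interval is smaller
    # than the reference interval.
    target_count = sum(1 for gap in intervals if gap < reference_interval)
    # F3: minimum interval; undefined when fewer than two events exist.
    min_interval = min(intervals) if intervals else None
    return intervals, target_count, min_interval


print(interval_stats([0, 10, 15, 100], reference_interval=20))
```

A file with fewer than two recorded events has no interval, so the sketch returns `None` for the minimum; the source does not say how such files are treated.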
In a second aspect, the invention provides equipment comprising a processor, a memory and a communication bus.
The memory has stored thereon a computer readable program executable by the processor.
The communication bus enables connection communication between the processor and the memory.
The steps in a method for data analysis and management of a solid state disk according to any one of claims 1 to 8 are implemented when the processor executes the computer readable program.
In a third aspect, the invention provides a data analysis management system for a solid state disk, comprising:
A file classification matching module, configured to extract the file name of each file in the target computer, construct the file repositories, and classify and match the files by file name so that each file is stored in its corresponding file repository.
A cloud database, configured to store the keyword library associated with each file repository.
A file repository cleaning demand analysis module, configured to extract the byte capacity of the solid state disk in the target computer, the number of stored files and the number of accesses of each file repository on each monitoring day, and the byte count of each file in each file repository; to analyze the cleaning demand trend index of each file repository; and, when the cleaning demand trend index of a file repository is greater than or equal to a set value, to mark that repository as a target cleaning repository and invoke the file cleaning adaptation analysis module.
A file cleaning adaptation analysis module, configured to mark each file in each target cleaning repository as a target file, extract the accumulated storage duration and operation information of each target file, and analyze the cleaning adaptation index of each target file; when the cleaning adaptation index of a target file is greater than or equal to a set value, the target file is a file to be cleaned, and the module for processing the files to be cleaned is invoked.
A module for processing the files to be cleaned, configured to delete each file to be cleaned in each target cleaning repository.
Compared with the prior art, the invention has the following advantages and positive effects:
(1) By classifying the files, analyzing the cleaning demand trend index of each file repository from the byte counts of the files combined with the file growth trend, analyzing the cleaning adaptation index of each target file, and finally processing the files to be cleaned, the invention releases storage space on the solid state disk of the target computer. This addresses the limitations of current file data analysis and management and enables multi-dimensional analysis of the files to be cleaned, so that the storage space of the solid state disk can meet user needs while enough space remains for backup data, reducing the likelihood of data loss and of backup data being unrecoverable.
(2) The file growth trend evaluation index of each file repository is calculated from the number of stored files of the repository on each monitoring day and is then used to calculate the repository's cleaning demand trend index. This displays the growth of files in each repository intuitively, improves the accuracy with which the repository's cleaning demand trend is determined, and in turn improves the space-release effect on the solid state disk of the target computer.
(3) The access frequency index and the modification frequency index of each target file are calculated from its access count and from the time points of each access and each modification, and are then used to analyze the cleaning adaptation index of the file. This broadens the factors covered by the cleaning adaptation analysis, allows the cleaning suitability of each file to be determined accurately, reduces the error of the analysis and improves its reliability, and provides a reliable basis for deciding how to process the files to be cleaned.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed for the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of the method of the present invention.
Fig. 2 is a system module connection diagram of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Example 1
Referring to fig. 1, a data analysis management method for a solid state disk comprises the following steps:
S1. File classification matching: extract the file name of each file in the target computer, construct the file repositories, and classify and match the files by file name so that each file is stored in its corresponding file repository.
The file names of the files are extracted from the system of the target computer.
In a specific embodiment of the invention, each file is stored in its corresponding file repository as follows:
A1. Extract text information from the file name of each file.
A2. Segment the extracted text of each file into words to obtain the phrases of that file.
A3. Match the phrases of each file against the keyword library associated with each file repository, as stored in the cloud database. If a phrase of a file is found in a keyword library, the file repository associated with that keyword library is taken as the file's repository, and the file is stored there.
S2. File repository cleaning demand analysis: extract the byte capacity of the solid state disk in the target computer, the number of stored files and the number of accesses of each file repository on each monitoring day, and the byte count of each file in each file repository; analyze the cleaning demand trend index of each file repository; when the cleaning demand trend index of a file repository is greater than or equal to a set value, mark that repository as a target cleaning repository and execute step S3.
It should be noted that the byte capacity of the solid state disk, the number of stored files and the number of accesses of each file repository on each monitoring day, and the byte count of each file in each file repository are all extracted from the system of the target computer.
In a specific embodiment of the invention, the cleaning demand trend index of each file repository is analyzed as follows:
B1. From the number of stored files of each file repository on each monitoring day, calculate the file growth trend evaluation index $\beta_i$ of each file repository, where $i$ is the index of the file repository, $i = 1, 2, \ldots, n$.
In a specific embodiment of the invention, the file growth trend evaluation index of each file repository is calculated as follows:
C1. Construct a file growth curve for each file repository with the monitoring day as the abscissa and the number of stored files as the ordinate, read the slope from the curve, and take it as the file growth rate of the repository, denoted $K_i$.
C2. Set the file growth rate correction factor $\lambda_i$ of each file repository.
In a specific embodiment of the invention, the file growth rate correction factor of each file repository is set as follows:
D1. Taking the starting point of each file repository's file growth curve as the base point and the set reference file growth rate as the slope, construct a reference line on the file growth curve. Count the monitoring days on which the curve lies below the reference line, and take this count as the deviation number, denoted $M_i$.
D2. Read the amplitude of each file repository's file growth curve, denoted $H_i$.
D3. Set the file growth rate correction factor $\lambda_i$ of each file repository [formula not reproduced in the source text], where $M'$ and $H'$ denote the set reference values of the deviation number and the file growth curve amplitude, respectively, and $a_4$ and $a_5$ denote the set weights of the deviation number and the file growth curve amplitude deviation in the evaluation of the file growth rate correction factor.
C3. Calculate the file growth trend evaluation index $\beta_i$ of each file repository [formula not reproduced in the source text], where $K_i'$ denotes the set reference file growth rate of the $i$-th file repository.
B2. Sum the accesses of each file repository over the monitoring days to obtain the total number of accesses of each file repository, denoted $\eta_i$.
B3. Sum the byte counts of the files in each file repository to obtain the combined file byte count of each file repository, denoted $\varepsilon_i$.
B4. Denote the byte capacity of the solid state disk in the target computer as $\varepsilon_{\text{total}}$.
B5. Calculate the cleaning demand trend index $\chi_i$ of each file repository [formula not reproduced in the source text], where $\beta'$, $\tau_1$ and $\eta'$ denote the set reference values of the file growth trend evaluation index, the combined file byte count ratio and the total number of accesses, respectively; $a_1$, $a_2$ and $a_3$ denote the set weights of the file growth trend evaluation index, the combined file byte count ratio and the total number of accesses in the evaluation of the cleaning demand trend index; and $e$ is the natural constant.
In this embodiment, the file growth trend evaluation index of each file repository is calculated from the number of stored files of the repository on each monitoring day and is then used to calculate the repository's cleaning demand trend index. This displays the growth of files in each repository intuitively, improves the accuracy with which the repository's cleaning demand trend is determined, and in turn improves the space-release effect on the solid state disk of the target computer.
S3. File cleaning adaptation analysis: mark each file in each target cleaning repository as a target file, extract the accumulated storage duration and operation information of each target file, and analyze the cleaning adaptation index of each target file; when the cleaning adaptation index of a target file is greater than or equal to a set value, the target file is a file to be cleaned, and step S4 is executed.
In a specific embodiment of the present invention, the operation information includes a time point corresponding to each access and a time point corresponding to each modification.
It should be noted that, the accumulated storage duration of each target file, the time point corresponding to each access, and the time point corresponding to each modification are all extracted from the system of the target computer.
In a specific embodiment of the present invention, analyzing the cleaning adaptation index corresponding to each target file includes: E1, extracting, from the operation information, the time point corresponding to each access and the time point corresponding to each modification of each target file.
E2, calculating the access frequency index $\delta_j$ of each target file from the time points corresponding to its accesses, where $j$ denotes the number of the target file, $j = 1, 2, \ldots, m$.
In a specific embodiment of the present invention, calculating the access frequency index of each target file includes: F1, comparing the time points corresponding to the accesses of each target file to obtain the time interval corresponding to each access of each target file.
F2, comparing the time interval corresponding to each access of each target file with the set reference access time interval; if the time interval corresponding to an access of a target file is smaller than the set reference access time interval, recording that access as a target access; counting the number of target accesses of each target file and denoting it $\rho_j$.
F3, extracting the minimum of the time intervals corresponding to the accesses of each target file and denoting it $T_j$.
F4, calculating the access frequency index $\delta_j$ of each target file, where $\rho'$ and $T'$ respectively denote the set reference number of target accesses and the set reference access time interval, and $b_1$ and $b_2$ respectively denote the set weights of the number of target accesses and of the access time interval in evaluating the access frequency index.
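Steps F1 to F3 can be illustrated with a minimal Python sketch. It assumes that "the time interval corresponding to each access" means the gap between an access and the previous one; the function name `access_interval_stats` and the sample timestamps are hypothetical. The F4 formula is not available in this text, so the sketch stops at its two inputs, $\rho_j$ and $T_j$.

```python
from datetime import datetime, timedelta


def access_interval_stats(access_times, reference_interval):
    """Return (rho, t_min) for one target file.

    rho   -- number of accesses whose gap from the previous access is
             shorter than reference_interval (step F2)
    t_min -- smallest gap between consecutive accesses (step F3),
             or None when the file has fewer than two accesses
    """
    ordered = sorted(access_times)
    # F1: gaps between consecutive access time points (assumed meaning
    # of "time interval corresponding to each access")
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    rho = sum(1 for gap in gaps if gap < reference_interval)
    t_min = min(gaps) if gaps else None
    return rho, t_min


# Hypothetical access log for one target file
times = [
    datetime(2023, 9, 1, 8, 0),
    datetime(2023, 9, 1, 9, 0),
    datetime(2023, 9, 3, 9, 0),
    datetime(2023, 9, 3, 9, 30),
]
rho, t_min = access_interval_stats(times, timedelta(hours=2))
print(rho, t_min)
```

The same helper could be reused for steps G1 to G3 by passing modification time points and the set reference modification time interval.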
E3, calculating the modification frequency index $\omega_j$ of each target file in the same way as the access frequency index of each target file.
It should be noted that calculating the modification frequency index of each target file specifically includes: G1, comparing the time points corresponding to the modifications of each target file to obtain the time interval corresponding to each modification of each target file.
G2, comparing the time interval corresponding to each modification of each target file with the set reference modification time interval; if the time interval corresponding to a modification of a target file is smaller than the set reference modification time interval, recording that modification as a target modification; counting the number of target modifications of each target file and denoting it $\sigma_j$.
G3, extracting the minimum of the time intervals corresponding to the modifications of each target file and denoting it $T_j'$.
G4, calculating the modification frequency index $\omega_j$ of each target file, where $\sigma'$ and $T'$ respectively denote the set reference number of target modifications and the set reference modification time interval, and $b_3$ and $b_4$ respectively denote the set weights of the number of target modifications and of the modification time interval in evaluating the modification frequency index.
E4, recording the accumulated storage duration of each target file.
E5, calculating the cleaning adaptation index corresponding to each target file, where $\delta'$, $\omega'$ and $T_{\text{store}}$ respectively denote the set reference access frequency index, modification frequency index and accumulated storage duration, and $a_6$, $a_7$ and $a_8$ respectively denote the set weights of the access frequency index, the modification frequency index and the accumulated storage duration in evaluating the cleaning adaptation index.
In this embodiment of the invention, the access frequency index and the modification frequency index of each target file are calculated from its number of accesses, the time point corresponding to each access and the time point corresponding to each modification, and the cleaning adaptation index corresponding to each target file is analyzed from them. Considering these factors together broadens the coverage of the cleaning adaptation analysis, gives an accurate picture of how suitable each file is for cleaning, reduces the error of the analysis and improves its reliability, and provides a reliable basis for deciding how to process the files to be cleaned.
S4, processing of files to be cleaned: deleting each file to be cleaned in each target cleaning storage library.
In this embodiment of the invention, files are classified; the cleaning demand trend index corresponding to each file storage library is analyzed by combining the byte counts of its files with its file growth trend; the cleaning adaptation index is analyzed for each target file; and finally the files to be cleaned are processed to release storage space on the solid state disk of the target computer. This effectively addresses the limitations of current file data analysis and management, and makes the confirmation of files to be cleaned a multi-dimensional analysis. The storage space of the solid state disk can thus meet the user's needs while leaving enough room for backup data, which reduces the risk of data loss and of backup data being unrecoverable.
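The hand-off from S3 to S4 can be sketched as follows. This is an illustration under assumptions rather than the patented implementation: the cleaning adaptation index values are hypothetical inputs (their formula is not available in this text), and the names `select_files_to_clean`, `delete_files` and the `dry_run` switch are invented for the example.

```python
from pathlib import Path
import tempfile


def select_files_to_clean(adaptation_index, set_value):
    """S3: keep the files whose cleaning adaptation index is >= set_value."""
    return [path for path, index in adaptation_index.items() if index >= set_value]


def delete_files(paths, dry_run=True):
    """S4: delete the selected files; with dry_run=True only report them."""
    handled = []
    for path in paths:
        if not dry_run:
            Path(path).unlink(missing_ok=True)  # requires Python 3.8+
        handled.append(path)
    return handled


# Demonstration in a throwaway directory so no real data is touched
with tempfile.TemporaryDirectory() as tmp:
    old_log = Path(tmp) / "old.log"
    report = Path(tmp) / "report.docx"
    old_log.write_text("stale")
    report.write_text("in use")
    scores = {old_log: 0.9, report: 0.2}  # hypothetical index values
    to_clean = select_files_to_clean(scores, set_value=0.6)
    delete_files(to_clean, dry_run=False)
    remaining = sorted(p.name for p in Path(tmp).iterdir())
print(remaining)
```

Defaulting `dry_run` to `True` is a design choice of this sketch: because deletion is irreversible, the caller has to opt in explicitly.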
Example 2
The invention proposes an apparatus comprising: processor, memory and communication bus.
The memory has stored thereon a computer readable program executable by the processor.
The communication bus enables connection communication between the processor and the memory.
When the processor executes the computer readable program, the steps of any one of the data analysis management methods for a solid state disk described above are implemented.
Example 3
Referring to Fig. 2, the present invention provides a data analysis management system for a solid state disk, including: a file classification matching module, a cloud database, a file storage library cleaning demand analysis module, a file cleaning adaptation analysis module and a to-be-cleaned file processing module.
The file classification matching module is connected with the cloud database and with the file storage library cleaning demand analysis module; the file storage library cleaning demand analysis module is connected with the file cleaning adaptation analysis module; and both the file storage library cleaning demand analysis module and the file cleaning adaptation analysis module are connected with the to-be-cleaned file processing module.
The file classification matching module is used for extracting the file name of each file in the target computer, constructing file storage libraries, and classifying and matching the files according to their file names so as to store each file in its corresponding file storage library.
The cloud database is used for storing the keyword library associated with each file storage library.
The file storage library cleaning demand analysis module is used for extracting the number of bytes that the solid state disk in the target computer can accommodate, and extracting the number of stored files and the number of accesses of each file storage library on each monitoring day and the number of bytes of each file in each file storage library, so as to analyze the cleaning demand trend index corresponding to each file storage library; when the cleaning demand trend index corresponding to a file storage library is greater than or equal to a set value, that file storage library is recorded as a target cleaning storage library and the file cleaning adaptation analysis module is invoked.
The file cleaning adaptation analysis module is used for recording each file in each target cleaning storage library as a target file, and extracting the accumulated storage duration and operation information of each target file, so as to analyze the cleaning adaptation index corresponding to each target file; when the cleaning adaptation index corresponding to a target file is greater than or equal to a set value, the target file is a file to be cleaned and the to-be-cleaned file processing module is invoked.
The to-be-cleaned file processing module is used for deleting each to-be-cleaned file in each target cleaning storage library.
The foregoing merely illustrates and explains the principles of the invention. Those skilled in the art may make various modifications and additions to the specific embodiments described, or substitute similar arrangements, without departing from the principles of the invention or exceeding the scope defined in the claims.
Claims (10)
1. A data analysis management method for a solid state disk, characterized by comprising the following steps:
S1, file classification matching: extracting the file name of each file in a target computer, constructing file storage libraries, and classifying and matching the files according to their file names so as to store each file in its corresponding file storage library;
S2, file storage library cleaning demand analysis: extracting the number of bytes that the solid state disk in the target computer can accommodate, and extracting the number of stored files and the number of accesses of each file storage library on each monitoring day and the number of bytes of each file in each file storage library, so as to analyze the cleaning demand trend index corresponding to each file storage library; when the cleaning demand trend index corresponding to a file storage library is greater than or equal to a set value, recording that file storage library as a target cleaning storage library and executing step S3;
S3, file cleaning adaptation analysis: recording each file in each target cleaning storage library as a target file, and extracting the accumulated storage duration and operation information of each target file, so as to analyze the cleaning adaptation index corresponding to each target file; when the cleaning adaptation index corresponding to a target file is greater than or equal to a set value, the target file is a file to be cleaned, and step S4 is executed;
S4, processing of files to be cleaned: deleting each file to be cleaned in each target cleaning storage library.
2. The data analysis management method for a solid state disk according to claim 1, characterized in that each file is stored in its corresponding file storage library as follows:
A1, extracting text information from the file name of each file;
A2, performing word segmentation on the text information extracted for each file to obtain the phrases corresponding to each file;
A3, matching the phrases corresponding to each file against the keyword library associated with each file storage library, as stored in the cloud database; if a phrase corresponding to a file is found in a keyword library, taking the file storage library associated with that keyword library as the file storage library of the file, so that each file is stored in its corresponding file storage library.
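Steps A1 to A3 can be sketched in a few lines of Python. The sketch makes several assumptions: word segmentation is approximated by splitting on non-alphanumeric characters (Chinese file names would need a real segmenter), the keyword libraries are an in-memory dictionary standing in for the cloud database, and the names `extract_phrases`, `match_repository` and the sample keywords are hypothetical.

```python
import re


def extract_phrases(file_name):
    """A1 + A2: take the text of a file name and split it into phrases."""
    stem = file_name.rsplit(".", 1)[0]  # drop the extension, if any
    return [w.lower() for w in re.split(r"[^0-9A-Za-z]+", stem) if w]


def match_repository(file_name, keyword_libraries):
    """A3: return the first storage library whose keyword library holds a phrase."""
    phrases = extract_phrases(file_name)
    for repository, keywords in keyword_libraries.items():
        if any(phrase in keywords for phrase in phrases):
            return repository
    return None  # no keyword library matched


# Hypothetical keyword libraries, one per file storage library
keyword_libraries = {
    "work": {"report", "invoice", "meeting"},
    "media": {"photo", "video", "song"},
}

print(match_repository("2023_Q3_report.pdf", keyword_libraries))
print(match_repository("holiday-photo.jpg", keyword_libraries))
```

The claim does not say what happens to a file that matches no keyword library or more than one; returning `None` and taking the first match are choices of this sketch only.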
3. The data analysis management method for a solid state disk according to claim 1, characterized in that the cleaning demand trend index corresponding to each file storage library is analyzed as follows:
B1, calculating the file growth trend evaluation index $\beta_i$ of each file storage library from the number of stored files of that file storage library on each monitoring day, where $i$ denotes the number of the file storage library, $i = 1, 2, \ldots, n$;
B2, accumulating the numbers of accesses of each file storage library over the monitoring days to obtain the total number of accesses of each file storage library, denoted $\eta_i$;
B3, accumulating the numbers of bytes of the files in each file storage library to obtain the comprehensive file byte count of each file storage library, denoted $\varepsilon_i$;
B4, denoting the number of bytes that the solid state disk in the target computer can accommodate as $\varepsilon_{\text{total}}$;
B5, calculating the cleaning demand trend index $\chi_i$ corresponding to each file storage library, where $\beta'$, $\tau_1$ and $\eta'$ respectively denote the set reference file growth trend evaluation index, comprehensive file byte count ratio and total number of accesses, $a_1$, $a_2$ and $a_3$ respectively denote the set weights of the file growth trend evaluation index, the comprehensive file byte count ratio and the total number of accesses in evaluating the cleaning demand trend index, and $e$ denotes the natural constant.
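The aggregation in steps B2 to B4 is simple enough to show directly. The sketch below assumes that the "comprehensive file byte count ratio" means $\varepsilon_i / \varepsilon_{\text{total}}$, which the text implies but does not state; the function name `repository_totals` and the sample numbers are hypothetical. The B5 formula itself is not available in this text and is not reproduced.

```python
def repository_totals(daily_access_counts, file_sizes, disk_capacity_bytes):
    """Return (eta, epsilon, byte_ratio) for one file storage library."""
    eta = sum(daily_access_counts)             # B2: total number of accesses
    epsilon = sum(file_sizes)                  # B3: comprehensive file byte count
    byte_ratio = epsilon / disk_capacity_bytes  # assumed: epsilon_i / epsilon_total
    return eta, epsilon, byte_ratio


# Hypothetical library: three monitoring days, two files, a 16,000-byte disk
eta, epsilon, byte_ratio = repository_totals([3, 0, 5], [1000, 3000], 16000)
print(eta, epsilon, byte_ratio)
```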
4. The data analysis management method for a solid state disk according to claim 3, characterized in that the file growth trend evaluation index of each file storage library is calculated as follows:
C1, constructing a file growth curve for each file storage library with the monitoring day as the abscissa and the number of stored files as the ordinate, locating the slope value of the curve, and taking it as the file growth rate of the file storage library, denoted $K_i$;
C2, setting the file growth rate correction factor $\lambda_i$ of each file storage library;
C3, calculating the file growth trend evaluation index $\beta_i$ of each file storage library, where $K'_i$ denotes the set reference file growth rate of the $i$-th file storage library.
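Step C1 does not say how the slope value is located on the file growth curve. One plausible reading, shown here purely as an assumption, is the least-squares slope of the stored-file count against the monitoring day; the function name `growth_rate` is hypothetical.

```python
def growth_rate(file_counts):
    """Least-squares slope of file count versus monitoring day (1, 2, ...)."""
    n = len(file_counts)
    if n < 2:
        raise ValueError("need at least two monitoring days")
    days = range(1, n + 1)
    mean_day = sum(days) / n
    mean_count = sum(file_counts) / n
    covariance = sum(
        (d - mean_day) * (c - mean_count) for d, c in zip(days, file_counts)
    )
    variance = sum((d - mean_day) ** 2 for d in days)
    return covariance / variance


# A library that gains two files per monitoring day
print(growth_rate([10, 12, 14, 16]))
```

A fitted slope is less sensitive to a single unusual day than the slope between the first and last points, which is why it is used in this sketch.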
5. The data analysis management method for a solid state disk according to claim 4, characterized in that the file growth rate correction factor of each file storage library is set as follows:
D1, taking the starting point of the file growth curve of each file storage library as the base point and the set reference file growth rate as the slope, constructing a reference line in the file growth curve of each file storage library; locating, from the file growth curve of each file storage library, the number of monitoring days that lie below the reference line, and taking this number as the deviation number, denoted $M_i$;
D2, locating the amplitude of the file growth curve from the file growth curve of each file storage library, denoted $H_i$;
D3, setting the file growth rate correction factor $\lambda_i$ of each file storage library, where $M'$ and $H'$ respectively denote the set reference deviation number and the set reference amplitude of the file growth curve, and $a_4$ and $a_5$ respectively denote the set weights of the deviation number and of the amplitude deviation of the file growth curve in evaluating the file growth rate correction factor.
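Steps D1 and D2 can be sketched as below. Two points are assumptions: the reference line is taken to pass through the first monitoring day's count and rise by the reference growth rate per day, and the "amplitude" of the curve is read as the difference between its maximum and minimum. The function name `deviation_and_amplitude` is hypothetical.

```python
def deviation_and_amplitude(file_counts, reference_rate):
    """Return (M, H) for one file storage library's growth curve."""
    base = file_counts[0]
    # D1: count the monitoring days whose file count lies below the reference
    # line that starts at the first point and has slope reference_rate
    below = sum(
        1
        for offset, count in enumerate(file_counts)
        if count < base + reference_rate * offset
    )
    # D2 (assumption): amplitude taken as maximum minus minimum of the curve
    amplitude = max(file_counts) - min(file_counts)
    return below, amplitude


# Reference line for this data is 10, 12, 14, 16
print(deviation_and_amplitude([10, 11, 15, 14], reference_rate=2))
```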
6. The data analysis management method for a solid state disk according to claim 1, characterized in that the operation information comprises a time point corresponding to each access and a time point corresponding to each modification.
7. The data analysis management method for a solid state disk according to claim 6, characterized in that the cleaning adaptation index corresponding to each target file is analyzed as follows:
E1, extracting, from the operation information, the time point corresponding to each access and the time point corresponding to each modification of each target file;
E2, calculating the access frequency index $\delta_j$ of each target file from the time points corresponding to its accesses, where $j$ denotes the number of the target file, $j = 1, 2, \ldots, m$;
E3, calculating the modification frequency index $\omega_j$ of each target file in the same way as the access frequency index of each target file;
E4, recording the accumulated storage duration of each target file;
E5, calculating the cleaning adaptation index corresponding to each target file, where $\delta'$, $\omega'$ and $T_{\text{store}}$ respectively denote the set reference access frequency index, modification frequency index and accumulated storage duration, and $a_6$, $a_7$ and $a_8$ respectively denote the set weights of the access frequency index, the modification frequency index and the accumulated storage duration in evaluating the cleaning adaptation index.
8. The data analysis management method for a solid state disk according to claim 7, characterized in that the access frequency index of each target file is calculated as follows:
F1, comparing the time points corresponding to the accesses of each target file to obtain the time interval corresponding to each access of each target file;
F2, comparing the time interval corresponding to each access of each target file with the set reference access time interval; if the time interval corresponding to an access of a target file is smaller than the set reference access time interval, recording that access as a target access; counting the number of target accesses of each target file and denoting it $\rho_j$;
F3, extracting the minimum of the time intervals corresponding to the accesses of each target file and denoting it $T_j$;
F4, calculating the access frequency index $\delta_j$ of each target file, where $\rho'$ and $T'$ respectively denote the set reference number of target accesses and the set reference access time interval, and $b_1$ and $b_2$ respectively denote the set weights of the number of target accesses and of the access time interval in evaluating the access frequency index.
9. An apparatus, comprising: a processor, a memory, and a communication bus;
the memory has stored thereon a computer readable program executable by the processor;
the communication bus realizes connection communication between the processor and the memory;
when the processor executes the computer readable program, the steps of the data analysis management method for a solid state disk according to any one of claims 1 to 8 are implemented.
10. A data analysis management system for a solid state disk, characterized by comprising:
a file classification matching module, used for extracting the file name of each file in the target computer, constructing file storage libraries, and classifying and matching the files according to their file names so as to store each file in its corresponding file storage library;
a cloud database, used for storing the keyword library associated with each file storage library;
a file storage library cleaning demand analysis module, used for extracting the number of bytes that the solid state disk in the target computer can accommodate, and extracting the number of stored files and the number of accesses of each file storage library on each monitoring day and the number of bytes of each file in each file storage library, so as to analyze the cleaning demand trend index corresponding to each file storage library; when the cleaning demand trend index corresponding to a file storage library is greater than or equal to a set value, that file storage library is recorded as a target cleaning storage library and the file cleaning adaptation analysis module is invoked;
a file cleaning adaptation analysis module, used for recording each file in each target cleaning storage library as a target file, and extracting the accumulated storage duration and operation information of each target file, so as to analyze the cleaning adaptation index corresponding to each target file; when the cleaning adaptation index corresponding to a target file is greater than or equal to a set value, the target file is a file to be cleaned and the to-be-cleaned file processing module is invoked;
a to-be-cleaned file processing module, used for deleting each file to be cleaned in each target cleaning storage library.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311275018.8A CN117331501B (en) | 2023-09-28 | 2023-09-28 | Data analysis management method, equipment and system for solid state disk |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311275018.8A CN117331501B (en) | 2023-09-28 | 2023-09-28 | Data analysis management method, equipment and system for solid state disk |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117331501A (en) | 2024-01-02 |
CN117331501B (en) | 2024-06-07 |
Family
ID=89278511
Family Applications (1)
Application Number | Status | Priority Date | Filing Date | Title |
---|---|---|---|---|
CN202311275018.8A | Active, CN117331501B (en) | 2023-09-28 | 2023-09-28 | Data analysis management method, equipment and system for solid state disk |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117331501B (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH0652283A (en) * | 1992-07-29 | 1994-02-25 | Matsushita Electric Ind Co Ltd | Electronic filing device |
JP2006072856A (en) * | 2004-09-03 | 2006-03-16 | Rikogaku Shinkokai | Setting data generation program for file inspection, and system |
US20070094257A1 (en) * | 2005-10-25 | 2007-04-26 | Kathy Lankford | File management |
US20090192979A1 (en) * | 2008-01-30 | 2009-07-30 | Commvault Systems, Inc. | Systems and methods for probabilistic data classification |
CN101635651A (en) * | 2009-08-31 | 2010-01-27 | 杭州华三通信技术有限公司 | Method, system and device for managing network log data |
WO2016184199A1 (en) * | 2015-05-15 | 2016-11-24 | 中兴通讯股份有限公司 | File management method, equipment and system |
US20190114332A1 (en) * | 2017-10-18 | 2019-04-18 | Quantum Corporation | Automated storage tier copy expiration |
CN115174205A (en) * | 2022-07-01 | 2022-10-11 | 武汉轩游嘟嘟信息咨询有限公司 | Network space safety real-time monitoring method, system and computer storage medium |
CN115630173A (en) * | 2022-09-08 | 2023-01-20 | 武汉谆教教育咨询中心 | User data management method based on interestingness analysis |
Also Published As
Publication number | Publication date |
---|---|
CN117331501B (en) | 2024-06-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111538642B (en) | Abnormal behavior detection method and device, electronic equipment and storage medium | |
US8396840B1 (en) | System and method for targeted consistency improvement in a distributed storage system | |
US9183242B1 (en) | Analyzing frequently occurring data items | |
US20160307113A1 (en) | Large-scale batch active learning using locality sensitive hashing | |
US8468134B1 (en) | System and method for measuring consistency within a distributed storage system | |
CN111258593B (en) | Application program prediction model building method and device, storage medium and terminal | |
US20170351717A1 (en) | Column weight calculation for data deduplication | |
CN110389874B (en) | Method and device for detecting log file abnormity | |
CN111125658B (en) | Method, apparatus, server and storage medium for identifying fraudulent user | |
CN112951311A (en) | Hard disk fault prediction method and system based on variable weight random forest | |
Xu et al. | General feature selection for failure prediction in large-scale SSD deployment | |
CN112084330A (en) | Incremental relation extraction method based on course planning meta-learning | |
CN116841779A (en) | Abnormality log detection method, abnormality log detection device, electronic device and readable storage medium | |
CN117331501B (en) | Data analysis management method, equipment and system for solid state disk | |
CN113778964A (en) | Recording device for storing multiple temporary storage files and management method of temporary storage files | |
CN114969738B (en) | Interface abnormal behavior monitoring method, system, device and storage medium | |
CN114722081B (en) | Streaming data time sequence transmission method and system based on transfer library mode | |
CN113032575B (en) | Document blood relationship mining method and device based on topic model | |
CN113553398B (en) | Search word correction method, search word correction device, electronic equipment and computer storage medium | |
CN115345600A (en) | RPA flow generation method and device | |
CN111428576B (en) | Feature information learning method, electronic device and storage medium | |
CN113723436A (en) | Data processing method and device, computer equipment and storage medium | |
CN117076387B (en) | Quick gear restoration system for mass small files based on magnetic tape | |
US11429646B2 (en) | Non-transitory computer-readable storage medium storing information presentation program, information presentation device, and information presentation method of controlling to display information regarding trouble shooting | |
US20230325366A1 (en) | System and method for entity disambiguation for customer relationship management |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant |