CN117331501A - Data analysis management method, equipment and system for solid state disk - Google Patents

Publication number
CN117331501A
Authority
CN
China
Prior art keywords
file
target
cleaning
storage library
files
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202311275018.8A
Other languages
Chinese (zh)
Other versions
CN117331501B (en)
Inventor
李楚龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Jubang Technology Co ltd
Original Assignee
Shenzhen Jubang Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Jubang Technology Co ltd
Priority to CN202311275018.8A
Publication of CN117331501A
Application granted
Publication of CN117331501B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0608 Saving storage space on storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G06F16/162 Delete operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/0643 Management of files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/0652 Erasing, e.g. deleting, data cleaning, moving of data to a wastebasket
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0673 Single storage device
    • G06F3/0679 Non-volatile semiconductor memory device, e.g. flash memory, one time programmable memory [OTP]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the technical field of data analysis and management for solid state disks, and discloses a data analysis management method, equipment and system for a solid state disk. The method comprises file classification matching, file storage library cleaning requirement analysis, file cleaning adaptation analysis, and processing of the files to be cleaned. The files are first classified into file storage libraries; a cleaning demand trend index is then analyzed for each file storage library by combining the byte counts of its files with its file growth trend; a cleaning adaptation index is analyzed for each target file; and finally the files to be cleaned are deleted to release storage space on the solid state disk in the target computer. This addresses the limitations of current file data analysis and management, enables multi-dimensional analysis of the files to be cleaned, keeps the capacity of the solid state disk sufficient for the user's needs, and reserves enough space for backup data, reducing the likelihood of data loss and of backup data being unrecoverable.

Description

Data analysis management method, equipment and system for solid state disk
Technical Field
The invention relates to the technical field of data analysis and management of solid state disks, in particular to a method, equipment and a system for data analysis and management of a solid state disk.
Background
A solid state disk is a hard disk built from an array of solid state electronic memory chips and consists of a control unit, a storage unit and a cache unit. In a computer, a solid state disk can serve as the system disk to speed up system and application start-up, or as a storage disk to provide faster data access. When the solid state disk runs short of storage space, however, there is not enough room to store backup data, which may lead to data loss and to failure in restoring backups. The file data on the computer's solid state disk therefore needs to be analyzed and managed so that storage space can be released and the solid state disk can operate safely and stably.
Existing analysis and management of file data on a computer's solid state disk mainly examines the byte count and access count of each file, and has the following problems:
1. Only individual files are analyzed. The files are not classified, and the cleaning demand trend of a file storage library is not analyzed in combination with the file growth trend. The growth of files in a file storage library therefore cannot be shown intuitively, the cleaning demand trend of the file storage library is determined less accurately, and little storage space is released on the solid state disk in the target computer.
2. Only the access count and modification count of a file are considered; whether the file is accessed or modified frequently is not. The analysis is therefore not comprehensive, the cleaning suitability of each target file cannot be known accurately, the cleaning adaptation index of each target file carries a large error, and the credibility of the cleaning adaptation analysis cannot be improved.
Disclosure of Invention
Aiming at the problems, the invention aims to provide a method, equipment and a system for data analysis management of a solid state disk, which effectively solve the problems mentioned in the background art.
The technical scheme adopted to solve the technical problems is as follows. In a first aspect, the invention provides a data analysis management method for a solid state disk, comprising the following steps:
S1, file classification matching: extracting the file name of each file in the target computer, constructing the file storage libraries, and classifying and matching the files according to their file names so that each file is stored in its corresponding file storage library.
S2, file storage library cleaning requirement analysis: extracting the capacity in bytes of the solid state disk in the target computer, and extracting, for each file storage library, the number of stored files and the number of accesses on each monitoring day and the byte count of each file; analyzing the cleaning demand trend index of each file storage library; and, when the cleaning demand trend index of a file storage library is greater than or equal to a set value, marking that file storage library as a target cleaning storage library and executing step S3.
S3, file cleaning adaptation analysis: marking each file in each target cleaning storage library as a target file, extracting the accumulated storage duration and the operation information of each target file, and analyzing the cleaning adaptation index of each target file; when the cleaning adaptation index of a target file is greater than or equal to a set value, the target file is a file to be cleaned and step S4 is executed.
S4, processing the files to be cleaned: deleting each file to be cleaned in each target cleaning storage library.
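The control flow of steps S2 to S4 can be read as a two-stage threshold filter. The following Python sketch is illustrative only: the two index functions are passed in as callables because their formulas are not reproduced in this text, and every name in it (`select_files_to_clean`, `repo_index`, `file_index`, the two thresholds) is a hypothetical choice rather than part of the disclosed method.

```python
from typing import Callable, Dict, List


def select_files_to_clean(
    repos: Dict[str, List[str]],
    repo_index: Callable[[str], float],
    file_index: Callable[[str], float],
    repo_threshold: float,
    file_threshold: float,
) -> List[str]:
    """Return files whose library index and file index both reach their set values."""
    to_clean: List[str] = []
    for repo_name, files in repos.items():
        # S2: a library is a target cleaning storage library at or above the set value.
        if repo_index(repo_name) < repo_threshold:
            continue
        for path in files:
            # S3: a target file is a file to be cleaned at or above the set value.
            if file_index(path) >= file_threshold:
                to_clean.append(path)
    # S4 (deletion) is left to the caller so the sketch has no side effects.
    return to_clean
```

Returning the list instead of deleting inside the loop keeps the selection testable and lets a caller review the candidates before step S4 is carried out.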
Specifically, each file is stored in its corresponding file storage library as follows:
A1, extracting text information from the file name of each file.
A2, performing word segmentation on the text information extracted for each file to obtain the phrases corresponding to each file.
A3, matching the phrases corresponding to each file against the keyword library associated with each file storage library, the keyword libraries being stored in the cloud database; if a phrase corresponding to a file is found in a keyword library, the file storage library associated with that keyword library is taken as the file storage library of the file, so that each file is stored in its corresponding file storage library.
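Steps A1 to A3 can be sketched as follows. This is a minimal illustration under stated assumptions: word segmentation is approximated by splitting the file name on non-alphanumeric characters (a Chinese file name would need a real segmenter), the keyword libraries are given as an in-memory dictionary rather than fetched from a cloud database, and the function names are hypothetical.

```python
import re
from typing import Dict, List, Optional, Set


def tokenize_file_name(file_name: str) -> List[str]:
    """A1/A2: drop the extension and split the remaining text into lowercase tokens."""
    stem = file_name.rsplit(".", 1)[0] if "." in file_name else file_name
    return [t.lower() for t in re.split(r"[^0-9A-Za-z]+", stem) if t]


def match_repository(
    file_name: str, keyword_libraries: Dict[str, Set[str]]
) -> Optional[str]:
    """A3: return the first library whose keyword set contains one of the tokens."""
    tokens = tokenize_file_name(file_name)
    for repo_name, keywords in keyword_libraries.items():
        if any(token in keywords for token in tokens):
            return repo_name
    return None  # no keyword library matched this file name
```

With libraries `{"office": {"report", "invoice"}, "media": {"video", "holiday"}}`, the name `Quarterly_Report-2023.xlsx` yields the tokens `quarterly`, `report`, `2023` and is matched to `office`. The text does not say what happens to a file that matches no library, so the sketch simply returns `None`.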
Specifically, the cleaning demand trend index of each file storage library is analyzed as follows:
B1, calculating the file growth trend evaluation index $\beta_i$ of each file storage library from the number of stored files on each monitoring day, where $i$ denotes the number of the file storage library, $i = 1, 2, \ldots, n$.
B2, summing the number of accesses of each file storage library over the monitoring days to obtain the total number of accesses of each file storage library, denoted $\eta_i$.
B3, summing the byte counts of the files in each file storage library to obtain the comprehensive file byte count of each file storage library, denoted $\varepsilon_i$.
B4, denoting the capacity in bytes of the solid state disk in the target computer as $\varepsilon_{\text{total}}$.
B5, calculating the cleaning demand trend index $\chi_i$ of each file storage library [formula not reproduced in the source text], where $\beta'$, $\tau_1$ and $\eta'$ denote the set reference file growth trend evaluation index, comprehensive file byte count ratio and total number of accesses, respectively; $a_1$, $a_2$ and $a_3$ denote the set weights of the file growth trend evaluation index, the comprehensive file byte count ratio and the total number of accesses in the cleaning demand trend index, respectively; and $e$ denotes the natural constant.
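The inputs gathered in steps B2 to B4 can be sketched as below. The sketch stops short of B5, whose formula is not reproduced here. Reading the "comprehensive file byte count ratio" as the library's byte count divided by the disk capacity is an assumption, and the function names are hypothetical.

```python
from typing import List


def total_accesses(daily_accesses: List[int]) -> int:
    """B2: sum the per-monitoring-day access counts of one file storage library."""
    return sum(daily_accesses)


def total_bytes(file_sizes: List[int]) -> int:
    """B3: sum the byte counts of the files in one file storage library."""
    return sum(file_sizes)


def byte_share(repo_bytes: int, disk_capacity_bytes: int) -> float:
    """Share of the disk capacity (B4) taken by one library; an assumed reading."""
    if disk_capacity_bytes <= 0:
        raise ValueError("disk capacity must be positive")
    return repo_bytes / disk_capacity_bytes
```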
Specifically, the file growth trend evaluation index of each file storage library is calculated as follows:
C1, constructing a file growth curve for each file storage library with the monitoring day as the abscissa and the number of stored files as the ordinate, reading the slope value from the curve, and taking it as the file growth rate of each file storage library, denoted $K_i$.
C2, setting the file growth rate correction factor $\lambda_i$ of each file storage library.
C3, calculating the file growth trend evaluation index $\beta_i$ of each file storage library [formula not reproduced in the source text], where $K_i'$ denotes the set reference file growth rate of the $i$-th file storage library.
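Step C1 does not state how the slope is read from the file growth curve. One plausible choice, shown here purely as an assumption, is the least-squares slope of stored-file count against monitoring day; the function name `growth_rate` is hypothetical.

```python
from typing import Sequence


def growth_rate(file_counts: Sequence[float]) -> float:
    """Least-squares slope of file count against monitoring day (days 1, 2, ...)."""
    n = len(file_counts)
    if n < 2:
        raise ValueError("at least two monitoring days are needed")
    days = range(1, n + 1)
    mean_day = sum(days) / n
    mean_count = sum(file_counts) / n
    # Slope = covariance(day, count) / variance(day).
    covariance = sum(
        (d - mean_day) * (c - mean_count) for d, c in zip(days, file_counts)
    )
    variance = sum((d - mean_day) ** 2 for d in days)
    return covariance / variance
```

For the counts 10, 12, 14, 16 over four monitoring days the slope is 2 files per day. A simpler endpoint difference divided by the number of days would also fit the wording of C1.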
Specifically, the file growth rate correction factor of each file storage library is set as follows:
D1, taking the starting point of the file growth curve of each file storage library as the base point and the set reference file growth rate as the slope, constructing a reference line in the file growth curve of each file storage library, counting the monitoring days on which the file growth curve lies below the reference line, and taking this count as the deviation count, denoted $M_i$.
D2, reading the amplitude of the file growth curve of each file storage library, denoted $H_i$.
D3, setting the file growth rate correction factor $\lambda_i$ of each file storage library [formula not reproduced in the source text], where $M'$ and $H'$ denote the set reference deviation count and file growth curve amplitude, respectively, and $a_4$ and $a_5$ denote the set weights of the deviation count and of the file growth curve amplitude deviation in the file growth rate correction factor, respectively.
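The two quantities measured in steps D1 and D2 can be sketched as below. Two points are assumptions: the reference line is taken to rise by the reference growth rate per monitoring day starting from the first point of the curve, and the amplitude is read as the maximum minus the minimum of the curve. The function names are hypothetical.

```python
from typing import Sequence


def deviation_count(file_counts: Sequence[float], reference_rate: float) -> int:
    """D1: count monitoring days whose file count lies below the reference line.

    The line starts at the first point of the curve and rises by
    `reference_rate` files per monitoring day.
    """
    base = file_counts[0]
    return sum(
        1
        for day_offset, count in enumerate(file_counts)
        if count < base + reference_rate * day_offset
    )


def amplitude(file_counts: Sequence[float]) -> float:
    """D2: amplitude read as max minus min of the curve (an assumption)."""
    return max(file_counts) - min(file_counts)
```

For the counts 10, 11, 15, 14 and a reference rate of 2, the reference line passes through 10, 12, 14, 16, so the second and fourth monitoring days fall below it and the deviation count is 2.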
Specifically, the operation information includes a time point corresponding to each access and a time point corresponding to each modification.
Specifically, the cleaning adaptation index of each target file is analyzed as follows:
E1, extracting from the operation information the time point of each access and the time point of each modification of each target file.
E2, calculating the access frequency index $\delta_j$ of each target file from the time points of its accesses, where $j$ denotes the number of the target file, $j = 1, 2, \ldots, m$.
E3, calculating the modification frequency index $\omega_j$ of each target file in the same way as the access frequency index.
E4, denoting the accumulated storage duration of each target file [symbol not reproduced in the source text].
E5, calculating the cleaning adaptation index of each target file [formula not reproduced in the source text], where $\delta'$, $\omega'$ and $T_{\text{store}}$ denote the set reference access frequency index, modification frequency index and accumulated storage duration, respectively, and $a_6$, $a_7$ and $a_8$ denote the set weights of the access frequency index, the modification frequency index and the accumulated storage duration in the cleaning adaptation index, respectively.
Specifically, the access frequency index of each target file is calculated as follows:
F1, comparing the time points of the successive accesses of each target file to obtain the time interval corresponding to each access.
F2, comparing the time interval corresponding to each access of each target file with the set reference access time interval; if the time interval corresponding to an access of a target file is smaller than the set reference access time interval, marking that access as a target access; and counting the target accesses of each target file, denoted $\rho_j$.
F3, extracting the minimum of the time intervals corresponding to the accesses of each target file, denoted $T_j$.
F4, calculating the access frequency index $\delta_j$ of each target file [formula not reproduced in the source text], where $\rho'$ and $T'$ denote the set reference target access count and access time interval, respectively, and $b_1$ and $b_2$ denote the set weights of the target access count and the access time interval in the access frequency index, respectively.
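Steps F1 to F3 reduce a list of access time points to two numbers, the target access count and the minimum interval. A minimal sketch follows, assuming the time points are numeric timestamps (for example seconds since the epoch) and that an interval means the gap between consecutive accesses; the function names are hypothetical, and F4 is omitted because its formula is not reproduced here.

```python
from typing import List, Sequence, Tuple


def access_intervals(timestamps: Sequence[float]) -> List[float]:
    """F1: gaps between consecutive access time points (sorted first)."""
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def frequent_access_inputs(
    timestamps: Sequence[float], reference_interval: float
) -> Tuple[int, float]:
    """F2/F3: count intervals shorter than the reference and find the minimum interval."""
    intervals = access_intervals(timestamps)
    if not intervals:
        raise ValueError("at least two access time points are needed")
    target_count = sum(1 for gap in intervals if gap < reference_interval)
    return target_count, min(intervals)
```

For the time points 0, 10, 15, 100, 102 and a reference interval of 20, the intervals are 10, 5, 85 and 2, giving a target access count of 3 and a minimum interval of 2. The same two functions apply unchanged to modification time points.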
A second aspect of the invention proposes an apparatus comprising: processor, memory and communication bus.
The memory has stored thereon a computer readable program executable by the processor.
The communication bus enables connection communication between the processor and the memory.
The steps in a method for data analysis and management of a solid state disk according to any one of claims 1 to 8 are implemented when the processor executes the computer readable program.
A third aspect of the invention provides a data analysis management system for a solid state disk, comprising:
A file classification matching module, configured to extract the file name of each file in the target computer, construct the file storage libraries, and classify and match the files according to their file names so that each file is stored in its corresponding file storage library.
A cloud database, configured to store the keyword library associated with each file storage library.
A file storage library cleaning requirement analysis module, configured to extract the capacity in bytes of the solid state disk in the target computer, to extract, for each file storage library, the number of stored files and the number of accesses on each monitoring day and the byte count of each file, and to analyze the cleaning demand trend index of each file storage library; when the cleaning demand trend index of a file storage library is greater than or equal to a set value, that file storage library is marked as a target cleaning storage library and the file cleaning adaptation analysis module is executed.
A file cleaning adaptation analysis module, configured to mark each file in each target cleaning storage library as a target file, to extract the accumulated storage duration and the operation information of each target file, and to analyze the cleaning adaptation index of each target file; when the cleaning adaptation index of a target file is greater than or equal to a set value, the target file is a file to be cleaned and the to-be-cleaned file processing module is executed.
A to-be-cleaned file processing module, configured to delete each file to be cleaned in each target cleaning storage library.
Compared with the prior art, the invention has the following advantages and positive effects:
(1) The files are classified, the cleaning demand trend index of each file storage library is analyzed by combining the byte counts of its files with its file growth trend, the cleaning adaptation index is analyzed for each target file, and finally the files to be cleaned are deleted to release storage space on the solid state disk in the target computer. This addresses the limitations of current file data analysis and management, enables multi-dimensional analysis of the files to be cleaned, keeps the capacity of the solid state disk sufficient for the user's needs, and reserves enough space for backup data, reducing the likelihood of data loss and of backup data being unrecoverable.
(2) The file growth trend evaluation index of each file storage library is calculated from the number of stored files on each monitoring day, and the cleaning demand trend index of each file storage library is calculated from it. The growth of files in each file storage library is thereby shown intuitively, the cleaning demand trend of the file storage library is determined more accurately, and more storage space is released on the solid state disk in the target computer.
(3) The access frequency index and the modification frequency index of each target file are calculated from the time point of each access and the time point of each modification, and the cleaning adaptation index of each target file is analyzed from them. The analysis is therefore comprehensive and covers more aspects of each target file, the cleaning suitability of each file is known accurately, the error of the cleaning adaptation analysis is reduced and its credibility improved, and a reliable decision basis is provided for processing the files to be cleaned.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed for the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of the method of the present invention.
Fig. 2 is a system module connection diagram of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Example 1
Referring to Fig. 1, a data analysis management method for a solid state disk comprises the following steps:
S1, file classification matching: extracting the file name of each file in the target computer, constructing the file storage libraries, and classifying and matching the files according to their file names so that each file is stored in its corresponding file storage library.
It should be noted that the file name of each file is extracted from the system of the target computer.
In a specific embodiment of the invention, each file is stored in its corresponding file storage library as follows:
A1, extracting text information from the file name of each file.
A2, performing word segmentation on the text information extracted for each file to obtain the phrases corresponding to each file.
A3, matching the phrases corresponding to each file against the keyword library associated with each file storage library, the keyword libraries being stored in the cloud database; if a phrase corresponding to a file is found in a keyword library, the file storage library associated with that keyword library is taken as the file storage library of the file, so that each file is stored in its corresponding file storage library.
S2, file storage library cleaning requirement analysis: extracting the capacity in bytes of the solid state disk in the target computer, and extracting, for each file storage library, the number of stored files and the number of accesses on each monitoring day and the byte count of each file; analyzing the cleaning demand trend index of each file storage library; and, when the cleaning demand trend index of a file storage library is greater than or equal to a set value, marking that file storage library as a target cleaning storage library and executing step S3.
It should be noted that the capacity in bytes of the solid state disk, the number of stored files and the number of accesses of each file storage library on each monitoring day, and the byte count of each file in each file storage library are all extracted from the system of the target computer.
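How these values are read from the system is not specified. As one possible illustration, the Python standard library exposes file sizes, volume capacity and the latest access and modification times. The helper names below are hypothetical, and the sketch assumes each file storage library is represented as a list of file paths.

```python
import os
import shutil
from typing import Dict, List


def repository_bytes(paths: List[str]) -> int:
    """Sum the sizes in bytes of the files assigned to one file storage library."""
    return sum(os.path.getsize(p) for p in paths)


def disk_capacity_bytes(mount_point: str) -> int:
    """Total capacity in bytes of the volume that holds `mount_point`."""
    return shutil.disk_usage(mount_point).total


def file_times(path: str) -> Dict[str, float]:
    """Latest access and modification times (seconds since the epoch) of one file."""
    info = os.stat(path)
    return {"accessed": info.st_atime, "modified": info.st_mtime}
```

`os.stat` reports only the most recent access and modification times, and some file systems are mounted so that access times are updated rarely or not at all. The per-day access counts and the time point of every access used in the later steps would therefore have to come from separate logging, which this sketch does not cover.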
In a specific embodiment of the invention, the cleaning demand trend index of each file storage library is analyzed as follows:
B1, calculating the file growth trend evaluation index $\beta_i$ of each file storage library from the number of stored files on each monitoring day, where $i$ denotes the number of the file storage library, $i = 1, 2, \ldots, n$.
In a specific embodiment of the invention, the file growth trend evaluation index of each file storage library is calculated as follows:
C1, constructing a file growth curve for each file storage library with the monitoring day as the abscissa and the number of stored files as the ordinate, reading the slope value from the curve, and taking it as the file growth rate of each file storage library, denoted $K_i$.
C2, setting the file growth rate correction factor $\lambda_i$ of each file storage library.
In a specific embodiment of the invention, the file growth rate correction factor of each file storage library is set as follows:
D1, taking the starting point of the file growth curve of each file storage library as the base point and the set reference file growth rate as the slope, constructing a reference line in the file growth curve of each file storage library, counting the monitoring days on which the file growth curve lies below the reference line, and taking this count as the deviation count, denoted $M_i$.
D2, reading the amplitude of the file growth curve of each file storage library, denoted $H_i$.
D3, setting the file growth rate correction factor $\lambda_i$ of each file storage library [formula not reproduced in the source text], where $M'$ and $H'$ denote the set reference deviation count and file growth curve amplitude, respectively, and $a_4$ and $a_5$ denote the set weights of the deviation count and of the file growth curve amplitude deviation in the file growth rate correction factor, respectively.
C3, calculating the file growth trend evaluation index $\beta_i$ of each file storage library [formula not reproduced in the source text], where $K_i'$ denotes the set reference file growth rate of the $i$-th file storage library.
B2, summing the number of accesses of each file storage library over the monitoring days to obtain the total number of accesses of each file storage library, denoted $\eta_i$.
B3, summing the byte counts of the files in each file storage library to obtain the comprehensive file byte count of each file storage library, denoted $\varepsilon_i$.
B4, denoting the capacity in bytes of the solid state disk in the target computer as $\varepsilon_{\text{total}}$.
B5, calculating the cleaning demand trend index $\chi_i$ of each file storage library [formula not reproduced in the source text], where $\beta'$, $\tau_1$ and $\eta'$ denote the set reference file growth trend evaluation index, comprehensive file byte count ratio and total number of accesses, respectively; $a_1$, $a_2$ and $a_3$ denote the set weights of the file growth trend evaluation index, the comprehensive file byte count ratio and the total number of accesses in the cleaning demand trend index, respectively; and $e$ denotes the natural constant.
In this embodiment, the file growth trend evaluation index of each file storage library is calculated from the number of stored files on each monitoring day, and the cleaning demand trend index of each file storage library is calculated from it. The growth of files in each file storage library is thereby shown intuitively, the cleaning demand trend of the file storage library is determined more accurately, and more storage space is released on the solid state disk in the target computer.
S3, file cleaning adaptation analysis: marking each file in each target cleaning storage library as a target file, extracting the accumulated storage duration and the operation information of each target file, and analyzing the cleaning adaptation index of each target file; when the cleaning adaptation index of a target file is greater than or equal to a set value, the target file is a file to be cleaned and step S4 is executed.
In a specific embodiment of the invention, the operation information includes the time point of each access and the time point of each modification.
It should be noted that the accumulated storage duration of each target file, the time point of each access and the time point of each modification are all extracted from the system of the target computer.
In a specific embodiment of the present invention, the analyzing the cleaning adaptation index corresponding to each target file includes: and E1, extracting a time point corresponding to each access of each target file and a time point corresponding to each modification from the operation information.
E2, according to the corresponding time points of each access of each target file, calculating the access frequency index delta of each target file j Where j represents the number of the target file, j=1, 2,..m.
In a specific embodiment of the present invention, calculating the access frequency index of each target file includes:
F1, comparing the time points corresponding to successive accesses of each target file to obtain the time interval corresponding to each access of each target file.
F2, comparing the time interval corresponding to each access of each target file with the set reference access time interval; if the time interval corresponding to an access of a target file is smaller than the set reference access time interval, marking that access as a target access; counting the number of target accesses of each target file, denoted ρ_j.
F3, extracting the minimum value from the time intervals corresponding to the accesses of each target file, denoted T_j.
F4, calculating the access frequency index δ_j of each target file, wherein ρ′ and T′ respectively represent the set reference number of target accesses and the set reference access time interval, and b_1 and b_2 respectively represent the set weights of the number of target accesses and of the access time interval in evaluating the access frequency index.
E3, calculating the modification frequency index ω_j of each target file in the same way as the access frequency index of each target file.
It should be noted that calculating the modification frequency index of each target file specifically includes:
G1, comparing the time points corresponding to successive modifications of each target file to obtain the time interval corresponding to each modification of each target file.
G2, comparing the time interval corresponding to each modification of each target file with the set reference modification time interval; if the time interval corresponding to a modification of a target file is smaller than the set reference modification time interval, marking that modification as a target modification; counting the number of target modifications of each target file, denoted σ_j.
G3, extracting the minimum value from the time intervals corresponding to the modifications of each target file, denoted T_j′.
G4, calculating the modification frequency index ω_j of each target file, wherein σ′ and T′ respectively represent the set reference number of target modifications and the set reference modification time interval, and b_3 and b_4 respectively represent the set weights of the number of target modifications and of the modification time interval in evaluating the modification frequency index.
E4, marking the accumulated storage time of each target file as
E5, calculating the cleaning adaptation index corresponding to each target file, wherein δ′, ω′ and T′_store respectively represent the set reference access frequency index, modification frequency index and accumulated storage duration, and a_6, a_7 and a_8 respectively represent the set weights of the access frequency index, the modification frequency index and the accumulated storage duration in evaluating the cleaning adaptation index.
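Steps F1 to F4 (and, analogously, G1 to G4) can be illustrated with a minimal Python sketch. The formula that combines the target count and the minimum interval is not reproduced in this text, so the weighted sum below is only an assumed form; the function names, the default weights and the one-second floor on the minimum interval are likewise hypothetical.

```python
from datetime import datetime


def intervals_seconds(time_points):
    """F1/G1: gaps in seconds between consecutive time points."""
    ordered = sorted(time_points)
    return [(later - earlier).total_seconds()
            for earlier, later in zip(ordered, ordered[1:])]


def frequency_index(time_points, ref_interval, ref_count,
                    w_count=0.5, w_interval=0.5):
    """Frequency index from access (or modification) time points.

    ref_interval and ref_count stand for the set reference values
    (T' and rho' for accesses); w_count and w_interval stand for the
    weights (b_1 and b_2 for accesses).
    """
    gaps = intervals_seconds(time_points)
    if not gaps:
        return 0.0  # fewer than two events: no interval to evaluate
    # F2/G2: count events whose interval is below the reference interval
    target_count = sum(1 for gap in gaps if gap < ref_interval)
    # F3/G3: minimum interval (floored at 1 s to avoid division by zero;
    # the floor is an assumption of this sketch)
    min_gap = max(min(gaps), 1.0)
    # F4/G4: ASSUMED combination, not the formula from the patent
    return (w_count * target_count / ref_count
            + w_interval * ref_interval / min_gap)
```

Under this assumed form, a file with many short gaps between accesses receives a higher index than a file that is touched rarely; the same function can be reused for modification time points by passing the modification reference values and weights.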
In this embodiment, the access frequency index and the modification frequency index of each target file are calculated from the time points corresponding to each access and each modification of the target file, and the cleaning adaptation index corresponding to each target file is then analyzed from these indices together with the accumulated storage duration. Considering several factors together broadens the coverage of the cleaning adaptation analysis, gives an accurate picture of how suitable each file is for cleaning, reduces the error of the analysis, improves its reliability, and provides a reliable decision basis for processing the files to be cleaned.
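The selection of files to be cleaned in step S3 can be sketched as follows. The formula of step E5 is not reproduced in this text, so the scoring function is an assumed form in which rarely accessed, rarely modified and long-stored files score higher; all names, default weights and the bounded ratio `ref / (ref + value)` are hypothetical choices for illustration.

```python
def cleaning_adaptation_index(delta, omega, stored,
                              ref_delta, ref_omega, ref_stored,
                              a6=0.3, a7=0.3, a8=0.4):
    """ASSUMED form of the cleaning adaptation index.

    delta, omega: access and modification frequency indices of a file
    stored: accumulated storage duration (same unit as ref_stored)
    ref_*: set reference values; a6, a7, a8: set weights
    """
    access_term = a6 * ref_delta / (ref_delta + delta)        # low access -> high
    modification_term = a7 * ref_omega / (ref_omega + omega)  # low modification -> high
    storage_term = a8 * stored / ref_stored                   # long storage -> high
    return access_term + modification_term + storage_term


def files_to_clean(indices, set_value):
    """Keep the files whose index is greater than or equal to the set value."""
    return [name for name, index in indices.items() if index >= set_value]
```

For example, `files_to_clean({"old.tmp": 1.25, "report.doc": 0.4}, set_value=1.0)` keeps only `old.tmp`, mirroring the "greater than or equal to a set value" test of step S3.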
S4, processing of the files to be cleaned: deleting each file to be cleaned in each target cleaning storage library.
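Step S4 can be sketched with the Python standard library. The function name and the `dry_run` safeguard are assumptions added for illustration and are not part of the described method.

```python
from pathlib import Path


def delete_files_to_clean(paths, dry_run=True):
    """Delete each file to be cleaned and return the paths that were handled.

    With dry_run=True (hypothetical safeguard) nothing is removed; the
    function only reports which existing files would be deleted.
    """
    handled = []
    for path in map(Path, paths):
        if not path.is_file():
            continue  # skip missing paths and directories
        if not dry_run:
            path.unlink()  # actual deletion
        handled.append(path)
    return handled
```

Running once with `dry_run=True` and reviewing the returned list before deleting is a cautious way to use such a routine, since deletion cannot be undone.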
In this embodiment of the invention, the files are classified, the cleaning demand trend index corresponding to each file storage library is analyzed by combining the byte numbers of the files with the file growth trend, the cleaning adaptation index is analyzed for each target file, and the files to be cleaned are finally processed to release storage space on the solid state disk in the target computer. This effectively addresses the limitations of current file data analysis and management and realizes multi-dimensional analysis for confirming the files to be cleaned, so that the space of the solid state disk can meet the needs of the user while enough space is reserved for storing backup data, reducing the possibility of data loss and of being unable to restore backup data.
Example 2
The invention proposes an apparatus comprising: a processor, a memory and a communication bus.
The memory has stored thereon a computer readable program executable by the processor.
The communication bus enables connection communication between the processor and the memory.
When the processor executes the computer readable program, the steps of any one of the data analysis management methods for a solid state disk described above are implemented.
Example 3
Referring to fig. 2, the present invention provides a data analysis management system for a solid state disk, including: the system comprises a file classification matching module, a cloud database, a file repository cleaning demand analysis module, a file cleaning adaptation analysis module and a file processing module to be cleaned.
The file classification matching module is connected with the cloud database and the file storage library cleaning demand analysis module, the file storage library cleaning demand analysis module is connected with the file cleaning adaptation analysis module, and the file storage library cleaning demand analysis module and the file cleaning adaptation analysis module are connected with the file processing module to be cleaned.
The file classification matching module is used for extracting the file names of the files in the target computer, constructing each file storage library, and classifying and matching the files according to the file names of the files so as to store the files into the corresponding file storage libraries.
The cloud database is used for storing the keyword library associated with each file storage library.
The file storage library cleaning demand analysis module is used for extracting the number of accommodated bytes of the solid state disk in the target computer, extracting the number of stored files and the number of accesses of each file storage library on each monitoring day and the number of bytes of each file in each file storage library, and analyzing the cleaning demand trend index corresponding to each file storage library; when the cleaning demand trend index corresponding to a file storage library is greater than or equal to a set value, the file storage library is recorded as a target cleaning storage library and the file cleaning adaptation analysis module is executed.
The file cleaning adaptation analysis module is used for recording each file in each target cleaning storage library as a target file, extracting the accumulated storage duration and operation information of each target file, and analyzing the cleaning adaptation index corresponding to each target file; when the cleaning adaptation index corresponding to a target file is greater than or equal to a set value, the target file is a file to be cleaned and the to-be-cleaned file processing module is executed.
The to-be-cleaned file processing module is used for deleting each to-be-cleaned file in each target cleaning storage library.
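The way the analysis modules hand over to one another can be sketched as a two-stage filter. The `Repository` type, the `select_files_to_clean` function and the index callables are hypothetical names; the index calculations themselves are passed in as functions because their formulas are not reproduced in this text.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Repository:
    """Hypothetical stand-in for one file storage library."""
    name: str
    files: List[str] = field(default_factory=list)


def select_files_to_clean(repositories: List[Repository],
                          repo_index: Callable[[Repository], float],
                          file_index: Callable[[str], float],
                          repo_set_value: float,
                          file_set_value: float) -> Dict[str, List[str]]:
    """Return, per target cleaning storage library, the files to be cleaned."""
    result: Dict[str, List[str]] = {}
    for repo in repositories:
        # Cleaning demand analysis: only libraries at or above the set value proceed
        if repo_index(repo) < repo_set_value:
            continue
        # Cleaning adaptation analysis: only files at or above the set value are kept
        selected = [f for f in repo.files if file_index(f) >= file_set_value]
        if selected:
            result[repo.name] = selected
    return result
```

The returned mapping is what a to-be-cleaned file processing step would consume; keeping selection separate from deletion makes each stage easy to test on its own.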
The foregoing merely illustrates and explains the principles of the invention. Those skilled in the art may make various modifications or additions to the specific embodiments described, or substitute similar arrangements, without departing from the principles of the invention or exceeding the scope defined by the claims.

Claims (10)

1. A data analysis management method for a solid state disk, characterized by comprising the following steps:
S1, file classification matching: extracting the file names of all files in a target computer, constructing the file storage libraries, and classifying and matching the files according to their file names so as to store each file into the corresponding file storage library;
S2, file storage library cleaning demand analysis: extracting the number of accommodated bytes of the solid state disk in the target computer, extracting the number of stored files and the number of accesses of each file storage library on each monitoring day and the number of bytes of each file in each file storage library, analyzing the cleaning demand trend index corresponding to each file storage library, and, when the cleaning demand trend index corresponding to a file storage library is greater than or equal to a set value, recording the file storage library as a target cleaning storage library and executing step S3;
S3, file cleaning adaptation analysis: recording each file in each target cleaning storage library as a target file, extracting the accumulated storage duration and operation information of each target file, analyzing the cleaning adaptation index corresponding to each target file, and, when the cleaning adaptation index corresponding to a target file is greater than or equal to a set value, determining that the target file is a file to be cleaned and executing step S4;
S4, processing of the files to be cleaned: deleting each file to be cleaned in each target cleaning storage library.
2. The data analysis management method for a solid state disk according to claim 1, characterized in that the specific process of storing each file into the corresponding file storage library is as follows:
A1, extracting text information from the file name of each file;
A2, performing word segmentation on the text information extracted for each file to obtain the phrases corresponding to each file;
A3, matching the phrases corresponding to each file against the keyword library associated with each file storage library stored in the cloud database, and, if a phrase corresponding to a file is located in a keyword library, taking the file storage library associated with that keyword library as the file storage library of the file, so that each file is stored in the corresponding file storage library.
3. The data analysis management method for a solid state disk according to claim 1, characterized in that the specific process of analyzing the cleaning demand trend index corresponding to each file storage library is as follows:
B1, calculating the file growth trend evaluation index β_i of each file storage library according to the number of stored files of each file storage library on each monitoring day, where i represents the number of the file storage library, i = 1, 2, ..., n;
B2, accumulating the numbers of accesses of each file storage library on the monitoring days to obtain the total number of accesses of each file storage library, denoted η_i;
B3, accumulating the numbers of bytes of the files in each file storage library to obtain the comprehensive file byte number of each file storage library, denoted ε_i;
B4, recording the number of accommodated bytes of the solid state disk in the target computer as ε_total;
B5, calculating the cleaning demand trend index χ_i corresponding to each file storage library, wherein β′, τ_1 and η′ respectively represent the set reference file growth trend evaluation index, comprehensive file byte number ratio and total number of accesses, a_1, a_2 and a_3 respectively represent the set weights of the file growth trend evaluation index, the comprehensive file byte number ratio and the total number of accesses in evaluating the cleaning demand trend index, and e represents the natural constant.
4. The data analysis management method for a solid state disk according to claim 3, characterized in that the specific process of calculating the file growth trend evaluation index of each file storage library is as follows:
C1, constructing a file growth curve of each file storage library with the monitoring day as the abscissa and the number of stored files as the ordinate, locating the slope value from the curve, and taking it as the file growth rate of each file storage library, denoted K_i;
C2, setting the file growth rate correction factor λ_i of each file storage library;
C3, calculating the file growth trend evaluation index β_i of each file storage library, wherein K′_i represents the set reference file growth rate of the i-th file storage library.
5. The data analysis management method for a solid state disk according to claim 4, characterized in that the specific process of setting the file growth rate correction factor of each file storage library is as follows:
D1, taking the starting point of the file growth curve of each file storage library as the base point and the set reference file growth rate as the slope, constructing a reference datum line in the file growth curve of each file storage library, locating from the file growth curve of each file storage library the number of monitoring days lying below the reference datum line, and taking it as the deviation number, denoted M_i;
D2, locating the amplitude of the file growth curve from the file growth curve of each file storage library, denoted H_i;
D3, setting the file growth rate correction factor λ_i of each file storage library, wherein M′ and H′ respectively represent the set reference deviation number and the set reference amplitude of the file growth curve, and a_4 and a_5 respectively represent the set weights of the deviation number and of the amplitude deviation of the file growth curve in evaluating the file growth rate correction factor.
6. The data analysis management method for a solid state disk according to claim 1, characterized in that the operation information comprises the time point corresponding to each access and the time point corresponding to each modification.
7. The data analysis management method for a solid state disk according to claim 6, characterized in that the specific process of analyzing the cleaning adaptation index corresponding to each target file is as follows:
E1, extracting, from the operation information, the time point corresponding to each access and the time point corresponding to each modification of each target file;
E2, calculating the access frequency index δ_j of each target file according to the time points corresponding to each access of each target file, where j represents the number of the target file, j = 1, 2, ..., m;
E3, calculating the modification frequency index ω_j of each target file in the same way as the access frequency index of each target file;
E4, marking the accumulated storage time of each target file as
E5, calculating the cleaning adaptation index corresponding to each target file, wherein δ′, ω′ and T′_store respectively represent the set reference access frequency index, modification frequency index and accumulated storage duration, and a_6, a_7 and a_8 respectively represent the set weights of the access frequency index, the modification frequency index and the accumulated storage duration in evaluating the cleaning adaptation index.
8. The data analysis management method for a solid state disk according to claim 7, characterized in that the specific process of calculating the access frequency index of each target file is as follows:
F1, comparing the time points corresponding to successive accesses of each target file to obtain the time interval corresponding to each access of each target file;
F2, comparing the time interval corresponding to each access of each target file with the set reference access time interval; if the time interval corresponding to an access of a target file is smaller than the set reference access time interval, marking that access as a target access; counting the number of target accesses of each target file, denoted ρ_j;
F3, extracting the minimum value from the time intervals corresponding to the accesses of each target file, denoted T_j;
F4, calculating the access frequency index δ_j of each target file, wherein ρ′ and T′ respectively represent the set reference number of target accesses and the set reference access time interval, and b_1 and b_2 respectively represent the set weights of the number of target accesses and of the access time interval in evaluating the access frequency index.
9. An apparatus, characterized by comprising: a processor, a memory and a communication bus;
the memory has stored thereon a computer readable program executable by the processor;
the communication bus realizes connection communication between the processor and the memory;
when the processor executes the computer readable program, the steps of the data analysis management method for a solid state disk according to any one of claims 1 to 8 are implemented.
10. A data analysis management system for a solid state disk, characterized by comprising:
a file classification matching module, used for extracting the file names of all files in a target computer, constructing the file storage libraries, and classifying and matching the files according to their file names so as to store each file into the corresponding file storage library;
a cloud database, used for storing the keyword library associated with each file storage library;
a file storage library cleaning demand analysis module, used for extracting the number of accommodated bytes of the solid state disk in the target computer, extracting the number of stored files and the number of accesses of each file storage library on each monitoring day and the number of bytes of each file in each file storage library, analyzing the cleaning demand trend index corresponding to each file storage library, and, when the cleaning demand trend index corresponding to a file storage library is greater than or equal to a set value, recording the file storage library as a target cleaning storage library and executing the file cleaning adaptation analysis module;
a file cleaning adaptation analysis module, used for recording each file in each target cleaning storage library as a target file, extracting the accumulated storage duration and operation information of each target file, analyzing the cleaning adaptation index corresponding to each target file, and, when the cleaning adaptation index corresponding to a target file is greater than or equal to a set value, determining that the target file is a file to be cleaned and executing the to-be-cleaned file processing module;
a to-be-cleaned file processing module, used for deleting each file to be cleaned in each target cleaning storage library.
CN202311275018.8A 2023-09-28 2023-09-28 Data analysis management method, equipment and system for solid state disk Active CN117331501B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311275018.8A CN117331501B (en) 2023-09-28 2023-09-28 Data analysis management method, equipment and system for solid state disk

Publications (2)

Publication Number Publication Date
CN117331501A true CN117331501A (en) 2024-01-02
CN117331501B CN117331501B (en) 2024-06-07

Family

ID=89278511

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311275018.8A Active CN117331501B (en) 2023-09-28 2023-09-28 Data analysis management method, equipment and system for solid state disk

Country Status (1)

Country Link
CN (1) CN117331501B (en)

Also Published As

Publication number Publication date
CN117331501B (en) 2024-06-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant