Disclosure of Invention
In order to overcome the technical defects in the prior art, the invention discloses a data management method for mass dynamic files.
The data management method for the mass dynamic files comprises the following steps:
S1, setting a file storage directory architecture, wherein the file storage directory architecture comprises a plurality of layers of directories starting from a root directory;
configuring a parameter file, including a global deletion marker file under a root directory;
configuring a local deletion mark file under each directory;
configuring a root directory path as a global variable;
configuring the maximum contained subdirectory number maxDirNum of each directory;
configuring the maximum storable file quantity maxFileNum of each directory;
wherein the maximum storable file number for each directory represents files directly stored under the directory, and sub-directory storage files for the directory are not included;
setting a directory control variable currentDirs as a global variable; the initial value of the directory control variable currentDirs is the root directory; the directory control variable currentDirs is used for storing the directory path from the root directory to the current directory;
setting a file sequence number control variable currentSn as a global variable, which holds the storage sequence number of the previously added file; the initial value of the file sequence number control variable currentSn is 0;
S2, storing the newly added file in the position of a deleted file, specifically comprising the following steps:
querying whether the global deletion mark file contains a directory from which a file has been deleted; if so, obtaining the first such directory A1, opening the local deletion mark file of directory A1, obtaining the list of deleted file sequence numbers, and storing the newly added file in the vacancy at the storage position corresponding to the first file sequence number in the list;
otherwise, proceeding to step S3;
S3, storing the newly added file in the current directory, specifically comprising the following steps:
checking whether the current directory is full, that is, whether the file sequence number control variable currentSn is smaller than the maximum storable file quantity maxFileNum; if currentSn is smaller than maxFileNum, the current directory is not full and the method proceeds to step S31; otherwise it proceeds to step S4;
S31, updating the file sequence number control variable currentSn to its current value plus 1, that is, currentSn = currentSn + 1; proceeding to S32;
S32, storing the file under the updated storage sequence number;
S4, storing the newly added file in a subdirectory and determining the position of that subdirectory, specifically comprising the following steps:
generating a query directory array dirs from the directory control variable currentDirs; searching upward step by step from the last valid value of the query directory array dirs, one of the following two cases S41 and S42 applies:
S41, when a valid value C1 is smaller than the maximum subdirectory number maxDirNum, adding a new subdirectory in the directory layer corresponding to that valid value, the subdirectory number being 1, and updating the valid value C1 to C1+1;
modifying the last element value of the query directory array dirs to 1; generating the directory control variable currentDirs from dirs; setting the file sequence number control variable currentSn = 1;
S42, when traversal of the query directory array dirs shows that all valid values are equal to the maximum subdirectory number maxDirNum, adding a new directory layer and creating a new directory in the new directory layer: modifying the last element value of the dirs array to 1, generating a directory path from the directory array dirs, and creating a new subdirectory under that directory, the subdirectory number being 1; adding a new element with a value of 1 at the end of the query directory array dirs;
updating the directory control variable currentDirs from the query directory array dirs, and setting the file sequence number control variable currentSn = 1;
according to the global variable root directory path rootPath, synthesizing the full path file name fullFileName of the newly added file from the directory control variable currentDirs, the file sequence number control variable currentSn and the user file name; returning fullFileName;
S5, when a file is deleted, updating the global deletion mark file and the local deletion mark file: the directory of the deleted file is recorded in the global deletion mark file, and the storage sequence number of the deleted file is recorded in the local deletion mark file of the directory where the deleted file was located.
Preferably, storing the file in the vacancy in step S2 specifically comprises:
opening the local deletion mark file of directory A1, obtaining the list of deleted file sequence numbers, taking the first file sequence number, and storing the file in the corresponding vacancy:
S21, storing the file at the position of the first file sequence number D1 in the deleted file sequence number list of the local deletion mark file under directory A1;
Preferably, in step S5, when a file is deleted, the directory where the deleted file is located is recorded in the global deletion mark file gdelete, and at the same time the sequence number of the deleted file is recorded in the local deletion mark file of that directory.
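The bookkeeping performed on deletion can be sketched in Python as follows. This is a minimal illustration rather than the claimed implementation: the function name record_deletion and the in-memory representation (a list standing in for gdelete and a dictionary of lists standing in for the per-directory local deletion mark files) are assumptions, and reading or writing the mark files on disk is omitted.

```python
def record_deletion(gdelete, ldelete_by_dir, directory, sn):
    """Record that the file with sequence number sn was deleted from directory.

    gdelete        -- list of directories that contain at least one vacancy
    ldelete_by_dir -- hypothetical dict: directory -> list of deleted sequence numbers
    """
    # Global mark: remember the directory once, however many files it loses.
    if directory not in gdelete:
        gdelete.append(directory)
    # Local mark: remember the sequence number freed inside that directory.
    ldelete_by_dir.setdefault(directory, []).append(sn)


gdelete = []
ldelete_by_dir = {}
record_deletion(gdelete, ldelete_by_dir, "\\1\\3", 10)
record_deletion(gdelete, ldelete_by_dir, "\\1\\3", 50)
record_deletion(gdelete, ldelete_by_dir, "\\1\\100\\31", 7)
```

After these calls the global list holds each affected directory once, and each directory's local list holds its freed sequence numbers in deletion order.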
The invention realizes automatic establishment of a hierarchical multi-layer directory and management of its files through a concise data structure and a simple integer-based algorithm, and has the following advantages:
1: all directories except the root directory are named by natural numbers, and directory allocation is performed by integer calculation, which is simple and efficient; each file has a file sequence number within the directory where it is located, so that the management of directories and files can be completed through simple integer operations;
2: global variables store the current file directory and file sequence number, and the storage directory and sequence number of the next file can be conveniently calculated from these two variables;
3: the global deletion mark file is combined with the local deletion mark files, which improves the efficiency of data searching and processing.
Description of the embodiments
The following describes the embodiments of the present invention in further detail with reference to the accompanying drawings.
The data management method for the mass dynamic files comprises two aspects of adding files and deleting files.
Adding files is the core function of the invention; it determines the storage directory of the added file and the sequence number of the file in that directory.
In one embodiment, the file-adding service may be provided by a function newFile, whose input parameter is a file name without a path and which returns the file name with the file storage path and the sequence number added.
The new file function comprises the following steps:
S1, setting a file storage directory architecture comprising a plurality of layers of directories starting from a root directory;
configuring parameter files, including a global deletion mark file gdelete.json, which records all directories from which files have been deleted;
configuring a local deletion mark file ldelete.json under each directory, which records the sequence numbers of the files deleted from that directory;
configuring a root directory path rootPath;
configuring the maximum contained subdirectory number maxDirNum of each directory;
configuring the maximum storable file quantity maxFileNum of each directory;
setting a directory control variable currentDirs, wherein the initial value can be a root directory;
the file sequence number control variable currentSn is set, and the initial value may be 0.
The root directory is a layer 0 directory, the direct subdirectory of the root directory is a layer 1 directory, and the direct subdirectory of the layer 1 directory is a layer 2 directory; similarly, the direct subdirectory of the nth layer is the n+1 layer directory. The directory hierarchy is shown in fig. 2.
Wherein each directory may contain maxDirNum subdirectories, each subdirectory being named by a number 1,2,3, …, maxDirNum, respectively. The number of files that can be stored per directory is determined by the maximum amount of storage maxFileNum.
For example, one typical system configuration is as follows, with the parameters saved in JSON format in the configuration file config.json:
{
  "rootPath": "d:\\attachfile",
  "maxDirNum": 500,
  "maxFileNum": 500,
  "currentDirs": "\\",
  "currentSn": 0
}
rootPath: a root directory path.
maxDirNum: the number of subdirectories each directory may contain; the default value is 500 and may be modified.
maxFileNum: the number of files each directory may store; the default value is 500 and may be modified.
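As an illustration of how these parameters might be read, the following Python sketch parses a configuration in the JSON form described above. The class name StorageConfig and the function load_config are hypothetical; only the five parameter names and the default value 500 come from the description.

```python
import json
from dataclasses import dataclass


@dataclass
class StorageConfig:
    rootPath: str            # root directory path
    maxDirNum: int = 500     # subdirectories allowed per directory
    maxFileNum: int = 500    # files stored directly in each directory
    currentDirs: str = "\\"  # initial value: the root directory
    currentSn: int = 0       # sequence number of the previously added file


def load_config(text):
    """Parse a JSON configuration string into a StorageConfig."""
    return StorageConfig(**json.loads(text))


config = load_config('{"rootPath": "d:\\\\attachfile", "maxDirNum": 500, "maxFileNum": 500}')
```

Parameters omitted from the JSON text fall back to the defaults, so a configuration that names only rootPath is still usable in this sketch.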
The global configuration file global.json saves the global configuration parameters rootPath, maxFileNum and maxDirNum in JSON format.
The directory control variable currentDirs is the current directory, typically formatted as: first-layer subdirectory sequence number, second-layer subdirectory sequence number, and so on up to the n-th layer directory sequence number; the initial value is the root directory: \.
Examples of the directory control variable currentDirs:
\: the root directory
1: a second-layer directory whose directory number is 1
500/101: a third-layer directory whose second-layer directory number is 500 and whose third-layer directory number is 101; the second-layer directory number 500 indicates that this layer has reached the maximum subdirectory number maxDirNum = 500, that is, its directories are full, and the third-layer directory number 101 indicates that this layer has not reached maxDirNum, that is, its directories are not full.
The file sequence number control variable currentSn is the file sequence number of the directory where the previous newly added file is located, and is an integer, and the initial value is 0.
S2, querying whether the global deletion mark file contains a directory
The global deletion mark file gdelete.json records, in JSON format, which directories have had files deleted and can therefore store newly added files. One specific format is as follows:
[ "\1\3", "\1\100\31" ] indicates that two directories have had files deleted: the second-layer subdirectory with sequence number 3, and the third-layer subdirectory with sequence number 31 located under the second-layer subdirectory with sequence number 100.
gdelete.json is a JSON string array in which each element is a directory; each element indicates that a file under that directory has been deleted, leaving a vacancy, so that a newly added file can be saved in that directory.
Each directory has a local deletion mark file ldelete.json, which records the sequence numbers of the files deleted from that directory. ldelete.json records each deleted file sequence number as a JSON array, as follows:
[10,50,106]
This array indicates that, under the current directory, the three files with sequence numbers 10, 50 and 106 have been deleted, so three new files can be accepted to fill these three vacancies.
Directory generation rule: except for the root directory, a directory at layer n+1 can be generated only after the number of files stored in the layer-n directory reaches the upper limit.
Directory storage rule: all files are stored in directories according to the following principle: a newly added file can be stored in a higher-level directory only after the number of files in all lower-level directories reaches maxFileNum.
To avoid user files overwriting each other because they have the same name, the file sequence number stored in the directory may be inserted before the extension in the file name. For example, the original file name introduction.doc is modified to introduction-102.doc, where 102 is the file sequence number stored in the directory where the file is located.
When a file is added, it is preferentially stored in a vacancy left by a deleted file. The system records the vacancies left by deleted files in two types of files: the global deletion mark file gdelete and the local deletion mark file ldelete.
There is only one global deletion mark file gdelete, which records the directories where deleted files were located; each directory has one local deletion mark file ldelete, which records the sequence numbers of the files deleted from that directory. If gdelete is not empty, S21 is performed; otherwise S3 is performed.
S21, a directory containing a deleted file can be quickly obtained from the global deletion mark file gdelete, and a deleted file sequence number is obtained from the corresponding ldelete file of that directory, thereby obtaining the storage directory and file sequence number of the newly added file.
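The vacancy lookup of S2 and S21 can be sketched in Python as follows. The function name take_vacancy and the in-memory structures are assumptions made for illustration. The description does not state when a directory is removed from gdelete; this sketch assumes it is removed once its last vacancy has been handed out.

```python
def take_vacancy(gdelete, ldelete_by_dir):
    """Return (directory, sequence_number) of a reusable slot, or None.

    gdelete        -- list of directories that contain vacancies
    ldelete_by_dir -- hypothetical dict: directory -> list of deleted sequence numbers
    """
    if not gdelete:
        return None  # no vacancy anywhere: the caller proceeds to S3
    directory = gdelete[0]            # first directory with a vacancy
    numbers = ldelete_by_dir[directory]
    sn = numbers.pop(0)               # first deleted sequence number
    if not numbers:
        # Assumption: drop the directory once it has no vacancies left.
        gdelete.pop(0)
    return directory, sn


gdelete = ["\\1\\3"]
ldelete_by_dir = {"\\1\\3": [10, 50]}
first = take_vacancy(gdelete, ldelete_by_dir)
second = take_vacancy(gdelete, ldelete_by_dir)
third = take_vacancy(gdelete, ldelete_by_dir)
```

Because only the first entry of each list is consulted, the lookup cost does not depend on how many directories or files exist.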
S3, determining whether the current directory is full
If the file sequence number control variable currentSn is smaller than maxFileNum, which indicates that the current directory is not full, a new file can be added in the current directory, and steps S31 and S32 can be performed. If currentSn is equal to maxFileNum, the current directory has no file storage space, a new directory is needed, and step S4 is executed.
S31, updating the storage sequence number:
updating the file sequence number control variable currentSn to its current value plus 1, that is, currentSn = currentSn + 1;
S32, storing the file under the updated storage sequence number;
this completes the process of adding a file in the current directory.
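The check in S3 and the update in S31 reduce to a single integer comparison, which can be sketched in Python as follows. The function name next_sequence_number is hypothetical, and returning None to signal "directory full, go to S4" is a convention chosen for this sketch.

```python
def next_sequence_number(current_sn, max_file_num):
    """Return the sequence number for a new file in the current directory,
    or None when the directory is full and step S4 is required."""
    if current_sn < max_file_num:
        return current_sn + 1  # S31: currentSn = currentSn + 1
    return None                # directory full: a new directory is needed


first = next_sequence_number(0, 500)
last = next_sequence_number(499, 500)
full = next_sequence_number(500, 500)
```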
S4, searching upward step by step from the last valid value of the directory array:
the query directory array dirs is generated from the directory control variable currentDirs. The valid value of each element of dirs is equal to the current directory number of the directory at that layer, and is an integer between 1 and maxDirNum; when an element of the query directory array holds an invalid value, the directory of that layer has not been created;
each element of the query directory array dirs represents the directory number of one layer.
One specific implementation of the query directory array dirs is:
for example, dirs[0] stores the root directory sequence number and its value is constantly 0; dirs[1] stores the layer 2 directory sequence number; dirs[2] stores the layer 3 directory sequence number; similarly, dirs[n] stores the layer n+1 directory sequence number.
Example of the relationship between currentDirs and the dirs array:
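As one illustration, the conversion between the two representations can be sketched in Python. The sketch assumes that currentDirs is written with a leading backslash and backslash separators, as in the gdelete.json example; the function names to_dirs and to_current_dirs are hypothetical.

```python
def to_dirs(current_dirs):
    """Convert a path such as "\\1\\100\\31" into [0, 1, 100, 31].

    dirs[0] is the root directory and is always 0.
    """
    parts = [p for p in current_dirs.split("\\") if p]
    return [0] + [int(p) for p in parts]


def to_current_dirs(dirs):
    """Convert [0, 1, 100, 31] back into "\\1\\100\\31"; [0] gives the root "\\"."""
    return "\\" + "\\".join(str(n) for n in dirs[1:])


root = to_dirs("\\")
nested = to_dirs("\\1\\100\\31")
path = to_current_dirs([0, 1, 3])
```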
Starting from the last element of the query directory array dirs, the elements are examined one by one, that is, dirs[n], dirs[n-1], dirs[n-2], …, down to dirs[0], and one of the following two cases S41 and S42 applies.
S41, if the value C1 of a certain array element dirs[i] is not smaller than 1 and is smaller than maxDirNum, adding a new subdirectory in the directory corresponding to that element, the subdirectory sequence number being 1, and updating C1 to C1+1 and storing C1 back to dirs[i];
modifying the element dirs[i+1] of the query directory array dirs to 1; generating the directory control variable currentDirs from the dirs array; setting the file sequence number control variable currentSn = 1;
this case means that the number of subdirectories in a certain directory has reached the maximum value, but the number of that directory itself is not maxDirNum, so a sibling directory can be added in the same layer to hold new subdirectories; the directory number of the sibling directory is the directory number of the current directory plus 1.
S42, if traversal of the query directory array dirs shows that all valid values are equal to the maximum subdirectory number maxDirNum, adding a new directory layer and creating a new directory in the new directory layer, specifically as follows:
modifying the last element value of the query directory array dirs to 1, generating a directory path from dirs, and creating a new subdirectory under that directory, the subdirectory number being 1; adding an element at the end of dirs, the value of the newly added element being 1;
this case means that the subdirectory numbers of every established directory layer are full, so a new directory layer needs to be created; the new directory is located under the directory with directory number 1 of the current layer.
In the S41 and S42 steps, a directory control variable currentDirs is generated according to the query directory array dirs.
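One possible reading of S41 and S42 is sketched below in Python. It treats dirs[1:] like an odometer whose digits run from 1 to maxDirNum. This is an interpretation, not the definitive procedure: the function name next_directory is hypothetical, resetting every element after the incremented one to 1 is an assumption that generalizes "dirs[i+1] is modified to 1", and creation of the directories on disk is omitted.

```python
def next_directory(dirs, max_dir_num):
    """Return the dirs array of the next storage directory.

    dirs[0] is the root (always 0); dirs[1:] hold directory numbers 1..max_dir_num.
    """
    new_dirs = list(dirs)
    # S41: search upward from the last element for a value below the limit.
    for i in range(len(new_dirs) - 1, 0, -1):
        if new_dirs[i] < max_dir_num:
            new_dirs[i] += 1  # C1 = C1 + 1: move to the sibling directory
            # Assumption: every deeper layer restarts at directory number 1.
            for j in range(i + 1, len(new_dirs)):
                new_dirs[j] = 1
            return new_dirs
    # S42: all layers are full, so a new directory layer is added.
    if len(new_dirs) > 1:
        new_dirs[-1] = 1  # last element is modified to 1
    new_dirs.append(1)    # new layer, subdirectory number 1
    return new_dirs


from_root = next_directory([0], 500)
sibling = next_directory([0, 3, 500], 500)
new_layer = next_directory([0, 500, 500], 500)
```

After either case the caller would regenerate currentDirs from the returned array and set currentSn to 1, as the steps above describe.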
According to the global variable root directory path rootPath, the full path file name fullFileName of the newly added file is synthesized from the directory control variable currentDirs, the file sequence number control variable currentSn and the user file name; fullFileName is returned.
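The synthesis of fullFileName can be sketched in Python as follows. The sketch is illustrative only: the function name full_file_name is hypothetical, the directory is passed as a dirs array rather than a currentDirs string, the sequence number is placed before the extension with a hyphen as in the introduction-102.doc example, and the standard ntpath module is used so that Windows-style paths are produced on any platform.

```python
import ntpath


def full_file_name(root_path, dirs, current_sn, user_file_name):
    """Build the full path of a newly added file.

    root_path      -- rootPath from the configuration
    dirs           -- directory array; dirs[0] is the root and is skipped
    current_sn     -- file sequence number within the directory
    user_file_name -- file name supplied by the user, without a path
    """
    stem, ext = ntpath.splitext(user_file_name)
    numbered = f"{stem}-{current_sn}{ext}"  # e.g. introduction-102.doc
    sub_dirs = [str(n) for n in dirs[1:]]
    return ntpath.join(root_path, *sub_dirs, numbered)


nested = full_file_name("d:\\attachfile", [0, 1, 3], 102, "introduction.doc")
in_root = full_file_name("d:\\attachfile", [0], 1, "introduction.doc")
```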
The invention searches for a possible new directory from the lowest-layer directory up to the higher-layer directories through the array structure, and realizes the establishment and management of the directory system through simple addition operations. By setting each parameter, storable vacant positions and vacant directories are found automatically and the generation of new directories is strictly limited, which avoids the uneven distribution of stored files and the disordered directory layers caused by arbitrarily creating new directories in prior storage methods; combining the global deletion mark file with the local deletion mark files speeds up data searching.
The foregoing describes preferred embodiments of the invention. Unless they are obviously contradictory or one is premised on another, the preferred embodiments may be freely superimposed and combined. The embodiments and the specific parameters therein are intended only to describe the inventor's verification process clearly and are not intended to limit the scope of patent protection of the invention. Where English appears in parentheses in the description, it is the English name in the computer language or function of the term preceding the parentheses, or the name of a mathematical function or database that has no strict Chinese translation. The scope of patent protection of the invention remains subject to the claims, and all equivalent structural changes made using the description and the content of the drawings of the invention shall likewise be included in the scope of protection of the invention.