CN113806803B - Data storage method, system, terminal equipment and storage medium - Google Patents

Data storage method, system, terminal equipment and storage medium

Info

Publication number
CN113806803B
Authority
CN
China
Prior art keywords
file
data storage
information
data
directory
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111091089.3A
Other languages
Chinese (zh)
Other versions
CN113806803A (en)
Inventor
倪子程
陈奋
陈荣有
孙晓波
龚利军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen Fuyun Information Technology Co ltd
Original Assignee
Xiamen Fuyun Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiamen Fuyun Information Technology Co ltd
Priority to CN202111091089.3A
Publication of CN113806803A
Application granted
Publication of CN113806803B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60: Protecting data
    • G06F21/62: Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218: Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6227: Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database where protection concerns the structure of data, e.g. records, types, queries
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10: File systems; File servers
    • G06F16/13: File access structures, e.g. distributed indices
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60: Protecting data
    • G06F21/64: Protecting data integrity, e.g. using checksums, certificates or signatures
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioethics (AREA)
  • Software Systems (AREA)
  • Computer Hardware Design (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a data storage method, a system, a terminal device and a storage medium. The system comprises a file list file, directory structure files and data storage files. The file list file stores the file information of the directory structure files and the data storage files, together with the directory root node address. Each directory structure file and each data storage file comprises a file header, a data area and a digest area. The file header stores file information and structure information. The digest area stores the use state of each cluster in the corresponding data area, the number of valid clusters in use in the data area of each block, and the check code corresponding to the data area of each block. The data area of the directory structure file stores the address information of each file node; the data area of the data storage file stores the data information of each file node. The invention stores files in a directory tree structure, which speeds up query and traversal, reduces the size of the stored files and supports flexible full synchronization.

Description

Data storage method, system, terminal equipment and storage medium
Technical Field
The present invention relates to the field of file technologies, and in particular, to a data storage method, a system, a terminal device, and a storage medium.
Background
Web application systems are widely used in important lines of business such as social networking, shopping, banking and e-mail, and are a very important part of network assets. They also have a large attack surface, face many attack techniques, and are easily compromised.
Network attackers usually exploit vulnerabilities in the targeted website to tamper with web page content, for example by embedding illegal hidden links in web pages, for illegal profit or malicious commercial attack. Malicious tampering with a web page may prevent users from accessing its content normally, and may also create serious economic, brand and even political risks.
Common web page anti-tampering approaches include external polling, core embedding and event triggering. They share the same core idea: the hash of each web page file is calculated and stored in advance, and while the anti-tampering software runs, the actual hash of the current web page file is calculated as needed and compared with the recorded hash to determine whether the file has been tampered with. A storage means that can store information such as hashes quickly and reliably is therefore needed. The traditional approach is to store the data in a database, but because of various constraints of the working environment, web page anti-tampering is not suited to network databases such as MySQL and MSSQL, and the SQLite file database is mostly used instead. Although SQLite's performance meets the requirements, the following two problems remain in practical use.
1. SQLite stores data in tables. Although an index can be built on the path, the data is still a table structure in actual use, so traversing all the data cannot follow the actual file directory structure, which adds many extra queries and I/O operations.
2. In web page anti-tampering, hash data is calculated and generated in a secure environment and then synchronized to the anti-tampering program working online. There are two ways to synchronize: a. incremental synchronization: only the changed information is synchronized each time; this requires a complex and fine-grained log management mechanism, otherwise errors occur too easily; b. full synchronization: all hash data is synchronized in its entirety; on large sites the amount of data is large, and updating a single record often requires synchronizing hundreds of megabits of data.
Disclosure of Invention
In order to solve the problems, the invention provides a data storage method, a system, a terminal device and a storage medium.
The specific scheme is as follows:
a data storage system, comprising: file list files, directory structure files, and data storage files;
the file list file is used for storing file information and directory root node addresses of directory structure files and data storage files, and the file information comprises file codes, file types and file check codes;
the directory structure file and the data storage file each comprise a file header, a data area and a digest area, wherein:
the file header of the directory structure file is used for storing file information and structure information of the directory structure file; the file header of the data storage file is used for storing file information and structure information of the data storage file; wherein the structure information includes the total number of valid clusters in the data area;
the data area of the directory structure file is used for storing address information of each file node, and the address information consists of directory address information and file address information, wherein: the directory address information comprises the length of directory names, the number of sub-nodes contained in the directory, the address of a higher-level directory node of the directory, the address of each sub-node contained in the directory and the directory names; the file address information comprises the length of a file name, the address of a father node corresponding to the file node, the file name and the address of the corresponding file node of the file node in the data storage file;
the data area of the data storage file is used for storing data information of each file node, and the data information comprises a check code of the file and file storage path information;
the digest areas of the directory structure file and the data storage file are used for storing the use state of each cluster in the corresponding data area, the number of valid clusters in use in the data area of each block, and the check code corresponding to the data area of each block.
Further, the use state of each cluster stored in the digest area of the directory structure file is one of four types: unused, directory information, file information, and file name or directory name.
Further, the use state of each cluster stored in the digest area of the data storage file is one of three types: unused, check code of the file, and file storage path information.
A data storage method, based on the data storage system according to the embodiment of the present invention, including: when a file node needs to be added, acquiring the data information of the new file node from the file corresponding to it, and storing that data information in the data area of the data storage file; then, according to the address at which the data information is stored in the data storage file and the file storage path information, acquiring the directory address information and the file address information corresponding to the new file node, adding the file address information to the data area of the directory structure file, and updating or adding the directory address information.
Further, when information is added to the data area of the data storage file or the directory structure file, the corresponding digest area is searched for n consecutive clusters whose use state is unused, where n is the number of clusters required by the new information; if such clusters exist, the new information is stored in those n consecutive clusters of the data area; otherwise, a space of one block is added to the data storage file or the directory structure file, and the new information is stored in n consecutive clusters in the data area of the newly added space; after the new information is stored in the data area, the use states of the corresponding n clusters are updated in the digest area.
A data storage method, based on the data storage system according to the embodiment of the present invention, including: when a file node needs to be deleted, setting to unused the use state of the clusters corresponding to the file node in the digest areas of the data storage file and the directory structure file, without deleting the actual information stored in the data area.
Further, when a file node needs to be deleted, the method further comprises: checking whether the ratio of the total number of valid clusters of the data area, as stored in the file headers of the data storage file and the directory structure file, to the total number of all clusters of the data area is smaller than a ratio threshold; if so, defragmenting the data area of each data storage file or directory structure file whose ratio is below the threshold, and deleting surplus clusters block by block.
A data storage method, based on the data storage system according to the embodiment of the present invention, including: when it needs to be judged whether a file has been tampered with, performing the following steps:
S101: calculating the file check codes of the directory structure file and the data storage file, and comparing each calculated file check code with the file check code stored in the file header; if they are identical, judging that the file has not been tampered with; otherwise, proceeding to S102;
S102: calculating the check code corresponding to the data area of each block from the data areas of the directory structure file and the data storage file, comparing each calculated check code with the check code of the corresponding block stored in the digest area, and comparing the data areas of the blocks whose check codes differ byte by byte to obtain the changed clusters, thereby obtaining the tampered file.
A data storage terminal device comprising a processor, a memory and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the data storage method according to the second embodiment of the invention when the computer program is executed by the processor.
A computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the data storage method of the second embodiment of the present invention.
The invention adopts the technical scheme and has the beneficial effects that:
1. The invention stores files in a directory tree structure, which greatly speeds up query and traversal, reduces the size of the stored files and supports flexible full synchronization.
2. The invention is a file structure designed specifically for storing file directory data, combining characteristics of the SQLite file database and the FAT32 file system.
3. Compared with the SQLite database storage commonly used on the market, the invention supports storage split across separate files, fast difference comparison and traversal of data by directory structure, keeps the data compact, and effectively improves the working efficiency of web page anti-tampering.
Drawings
Fig. 1 is a schematic diagram of a file list file according to an embodiment of the invention.
Fig. 2 is a schematic diagram of a directory structure file according to a first embodiment of the present invention.
Fig. 3 is a schematic diagram of a storage structure of a node address according to a first embodiment of the present invention.
Fig. 4 is a schematic diagram of a data storage file according to a first embodiment of the present invention.
Detailed Description
The accompanying drawings are provided to further illustrate the embodiments. They form part of this disclosure, illustrate the embodiments and, together with the description, serve to explain the operating principles of the embodiments. With reference to them, one of ordinary skill in the art will understand other possible embodiments and the advantages of the present invention.
The invention will now be further described with reference to the drawings and detailed description.
Embodiment one:
This embodiment of the invention provides a data storage system comprising three types of files: a file list file, directory structure files and data storage files.
(1) The file list file is used for storing the file information of the directory structure files and the data storage files, together with the directory root node address.
In this embodiment, the file information includes a file code (file ID), a file type and a file check code, as shown in fig. 1. Each file has its own file check code; for example, file ID1 and file ID2 each correspond to their own file check code. In this embodiment the file check code is a CRC32 check code; in other embodiments other forms of check code may be used as needed, which is not limited herein.
The file types are directory structure file and data storage file, denoted 1 and 2 respectively.
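As a minimal sketch of one file list entry, the following Python models the three fields named above and computes a CRC32 with the standard `zlib.crc32`. The class name `FileListEntry`, the helper `make_entry` and the choice to checksum a raw byte string are illustrative assumptions; the text does not specify the on-disk encoding of an entry.

```python
import zlib
from dataclasses import dataclass

# Type codes taken from the text: 1 = directory structure file, 2 = data storage file.
TYPE_DIRECTORY_STRUCTURE = 1
TYPE_DATA_STORAGE = 2


@dataclass
class FileListEntry:
    """Hypothetical in-memory form of one file list record."""
    file_id: int      # file code (file ID)
    file_type: int    # 1 or 2, as defined above
    check_code: int   # CRC32 file check code


def make_entry(file_id: int, file_type: int, data_area: bytes) -> FileListEntry:
    """Build an entry whose check code is the CRC32 of the given data area bytes."""
    return FileListEntry(file_id, file_type, zlib.crc32(data_area) & 0xFFFFFFFF)


# Example: an entry for a directory structure file holding one empty cluster.
entry = make_entry(1, TYPE_DIRECTORY_STRUCTURE, b"\x00" * 16)
```

Keeping the check code in the file list lets a synchronizing party decide per file whether anything changed before opening the file itself.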
(2) The directory structure file is used for storing, as a tree, the file structure of the entire directory under the root node, with the entries sorted accordingly.
Referring to fig. 2, the directory structure file includes a file header, a data area, and a digest area.
The header of the directory structure file is used to store file information and structure information of the directory structure file. The file information is the same as the file information in the file list file: file code, file type and file check code. The file check code here is an overall check code of the data areas of all the blocks contained in the entire directory structure file. The structure information in this embodiment includes the digest start cluster number and the total number of valid clusters in the data area, i.e., the total number of clusters contained in the data area minus the number of unused blank clusters.
The data area of the directory structure file is used for storing address information of each file node, and the address information consists of directory address information and file address information, wherein: the directory address information comprises the length of the directory name, the number of child nodes contained in the directory, the address of the directory's parent node, the address of each child node contained in the directory and the directory name; the file address information comprises the length of the file name, the address of the parent node corresponding to the file node, the file name and the address of the file node's corresponding node in the data storage file.
A file node is the unit in which file information is stored: one file corresponds to one file node.
In this embodiment, a directory name is recorded starting from byte 0 of a cluster, and the length of the directory name is used to determine whether the name occupies more than one cluster; if so, the remainder is stored in the following cluster. The length of the file name and the file name are used and stored in the same way as the length of the directory name and the directory name.
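The directory address information can be sketched as a byte layout. The field order below follows the text (name length, child count, parent address, child addresses, then the name starting at byte 0 of its own cluster), but the field widths, the big-endian byte order, the UTF-8 encoding and the zero padding are assumptions made only for illustration, as are the function names.

```python
import struct
from typing import List, Tuple

CLUSTER_SIZE = 16          # bytes per cluster, as set in this embodiment
STATE_DIRECTORY_INFO = 1   # use-state value from the text
STATE_NAME = 3             # use-state value from the text


def pad_to_cluster(raw: bytes) -> bytes:
    """Zero-pad raw so its length is a whole number of clusters (assumed padding)."""
    return raw + b"\x00" * (-len(raw) % CLUSTER_SIZE)


def encode_directory(name: str, parent_addr: int,
                     child_addrs: List[int]) -> Tuple[bytes, List[int]]:
    """Return the cluster bytes and the per-cluster use states for one directory."""
    name_bytes = name.encode("utf-8")
    # Assumed widths: 2-byte name length, 2-byte child count, 4-byte addresses.
    info = struct.pack(">HHI", len(name_bytes), len(child_addrs), parent_addr)
    info += b"".join(struct.pack(">I", addr) for addr in child_addrs)
    info = pad_to_cluster(info)
    name_part = pad_to_cluster(name_bytes)  # the name starts at byte 0 of a cluster
    states = ([STATE_DIRECTORY_INFO] * (len(info) // CLUSTER_SIZE)
              + [STATE_NAME] * (len(name_part) // CLUSTER_SIZE))
    return info + name_part, states


# Example: directory "www" with two children occupies two clusters.
raw, states = encode_directory("www", 0x01000000, [0x01000002, 0x01000003])
```

Under these assumed widths the fixed fields and two child addresses fill exactly one 16-byte cluster, and the short name takes a second cluster.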
The address of the file node's corresponding node in the data storage file is the number of the cluster at which that file node is located in the data storage file; for example, when the file node is file 1, it occupies clusters 1 to 4 of the data storage file, as shown in fig. 4. In this embodiment, a node address is 4 bytes of data, as shown in fig. 3: the first byte records the file ID, and the last three bytes record the cluster sequence number.
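The 4-byte node address can be illustrated with a small pack/unpack pair. The split into a 1-byte file ID and a 3-byte cluster sequence number comes from the text; the big-endian byte order and the function names are assumptions.

```python
from typing import Tuple


def pack_node_address(file_id: int, cluster_no: int) -> bytes:
    """Pack a file ID (1 byte) and a cluster sequence number (3 bytes)."""
    if not 0 <= file_id <= 0xFF:
        raise ValueError("file ID must fit in one byte")
    if not 0 <= cluster_no <= 0xFFFFFF:
        raise ValueError("cluster number must fit in three bytes")
    # Byte order is an assumption; the text only fixes the 1 + 3 byte split.
    return bytes([file_id]) + cluster_no.to_bytes(3, "big")


def unpack_node_address(raw: bytes) -> Tuple[int, int]:
    """Inverse of pack_node_address: return (file_id, cluster_no)."""
    if len(raw) != 4:
        raise ValueError("a node address is exactly 4 bytes")
    return raw[0], int.from_bytes(raw[1:], "big")


address = pack_node_address(1, 4)
```

With this split, one address can name up to 256 files and up to 16,777,216 clusters per file, which at 16 bytes per cluster is 256 MiB of clusters per file.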
The digest area of the directory structure file is used for storing the use state of each cluster in the data area of the directory structure file, the number of valid clusters in use in the data area of each block, and the check code corresponding to the data area of each block.
In this embodiment, the use state of each cluster takes one of four values: 0 indicates unused, 1 indicates directory information, 2 indicates file information, and 3 indicates a file name or directory name.
In this embodiment, 1 cluster is 16 bytes and 1 block is 128 clusters; each block comprises a data area of 120 clusters and a digest area of 8 clusters.
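The sizes that follow from these settings can be worked out directly. The constant names and the helper `clusters_needed` are illustrative; the numbers themselves come from the embodiment.

```python
CLUSTER_SIZE = 16                 # bytes per cluster
CLUSTERS_PER_BLOCK = 128          # clusters per block
DATA_CLUSTERS_PER_BLOCK = 120     # data area clusters per block
DIGEST_CLUSTERS_PER_BLOCK = 8     # digest area clusters per block

BLOCK_SIZE = CLUSTER_SIZE * CLUSTERS_PER_BLOCK                  # 2048 bytes
DATA_BYTES_PER_BLOCK = CLUSTER_SIZE * DATA_CLUSTERS_PER_BLOCK   # 1920 bytes
DIGEST_BYTES_PER_BLOCK = CLUSTER_SIZE * DIGEST_CLUSTERS_PER_BLOCK  # 128 bytes


def clusters_needed(n_bytes: int) -> int:
    """Number of whole clusters required to hold n_bytes (ceiling division)."""
    return -(-n_bytes // CLUSTER_SIZE)
```

For instance, a 40-byte path needs `clusters_needed(40) == 3` clusters. The 128-byte digest area is large enough for one state byte per data cluster (120 bytes) plus 8 further bytes for the valid cluster count and the block check code, although the text does not state how the digest area is actually subdivided.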
(3) The data storage file is used for recording information of a file (such as a webpage file) corresponding to each file node. The structure of the data storage file is similar to that of the directory structure file, and as shown in fig. 4, the data storage file also includes a header, a data area, and a digest area.
The file header of the data storage file is used to store file information and structure information of the data storage file. The file information is the same as the file information in the file list file: file code, file type and file check code. It should be noted that the file check code here is an overall check code of the data areas of all the blocks contained in the entire data storage file. The structure information in this embodiment includes the digest start cluster number and the total number of valid clusters of the data area.
The data area of the data storage file is used for storing data information of each file node, and in this embodiment the data information includes the check code of the file and the file storage path information. The check code of the file is the check code of the file corresponding to each file node, for example the MD5 value of the web page file; it is different from the file check code in the file header. The file storage path information is the full path; if one cluster is not enough to hold it, it continues into the following clusters.
The digest area of the data storage file is used for storing the use state of each cluster in the corresponding data area, the number of valid clusters in use in the data area of each block, and the check code corresponding to the data area of each block. The use state of each cluster stored in the digest area of the data storage file takes one of three values: 0 indicates unused, 1 indicates a check code of the file, and 2 indicates file storage path information.
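A record in the data storage file can be sketched as one check code cluster followed by path clusters. An MD5 digest is 16 bytes, so it fills exactly one cluster of this embodiment. The function name, the UTF-8 encoding of the path and the zero padding of the last path cluster are assumptions; the use-state values are those given above.

```python
import hashlib
from typing import List, Tuple

CLUSTER_SIZE = 16
STATE_UNUSED, STATE_CHECK_CODE, STATE_PATH = 0, 1, 2  # use states from the text


def encode_record(content: bytes, full_path: str) -> Tuple[bytes, List[int]]:
    """Return the cluster bytes and use states for one file node's data information."""
    digest = hashlib.md5(content).digest()            # 16 bytes = one cluster
    path_bytes = full_path.encode("utf-8")
    # Assumed: the last path cluster is zero-padded to a full cluster.
    path_bytes += b"\x00" * (-len(path_bytes) % CLUSTER_SIZE)
    states = [STATE_CHECK_CODE] + [STATE_PATH] * (len(path_bytes) // CLUSTER_SIZE)
    return digest + path_bytes, states


# Example: a 24-byte path needs two path clusters, so the record spans three clusters.
data, record_states = encode_record(b"<html></html>", "/var/www/html/index.html")
```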
Embodiment two:
The invention also provides a data storage method, based on the data storage system of the first embodiment of the invention, comprising the following steps:
(1) In the initial stage, a file list file, a directory structure file and a data storage file are created, and a space of one block is allocated in advance to the directory structure file and to the data storage file respectively.
(2) When a file node needs to be added, the data information of the file corresponding to the new file node and the number n of clusters it occupies are acquired. The digest area of the data storage file is then searched for n consecutive clusters whose use state is unused. If they exist, the data information of the file is stored in those n consecutive clusters of the data area of the data storage file, and their use states are updated in the digest area of the data storage file. If not, a block of space, comprising a data area and a digest area, is added to the data storage file. In this embodiment, it is preferable that the data areas of all blocks be contiguous and the digest areas of all blocks be contiguous; that is, when a block of space is added, a 120-cluster data area is inserted between the existing data area and the digest area, and an 8-cluster digest area is appended after the existing digest area. It should be noted that the contents of different digest areas are independent of each other and each block has its own digest area; that is, each digest area records the number of valid clusters in use in the data area of its block and the check code corresponding to the data area of its block.
When there are too many blocks, an individual file becomes too large; in that case a new data storage file may be created so that the data is stored in segments.
The directory address information and file address information of the new file node are then acquired from its file storage path information; the corresponding directory address information is updated or added in the data area of the directory structure file, and the corresponding file address information is added. When directory address information or file address information is added to the data area, the digest area of the directory structure file must be searched, in the same manner as for the data storage file, for the required number of consecutive clusters whose use state is unused.
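The search for n consecutive unused clusters in step (2) can be sketched over an in-memory list of use states, one entry per data area cluster. This is a simplified model under stated assumptions: the digest area is represented as a plain Python list, a free run is allowed to cross block boundaries because the data areas are contiguous, and the function names are hypothetical.

```python
from typing import List, Optional

DATA_CLUSTERS_PER_BLOCK = 120
STATE_UNUSED = 0


def find_free_run(states: List[int], n: int) -> Optional[int]:
    """Return the index of the first run of n consecutive unused clusters, or None."""
    run = 0
    for i, state in enumerate(states):
        run = run + 1 if state == STATE_UNUSED else 0
        if run == n:
            return i - n + 1
    return None


def allocate(states: List[int], n: int, new_state: int) -> int:
    """Reserve n consecutive clusters, adding one block when no free run exists."""
    if not 1 <= n <= DATA_CLUSTERS_PER_BLOCK:
        raise ValueError("n must fit in the data area of one block")
    start = find_free_run(states, n)
    if start is None:
        # No suitable gap: add a block and store at the start of its data area.
        start = len(states)
        states.extend([STATE_UNUSED] * DATA_CLUSTERS_PER_BLOCK)
    for i in range(start, start + n):
        states[i] = new_state          # update the use states in the digest area
    return start


# Example: a full first block forces a second block to be added.
full = [1] * DATA_CLUSTERS_PER_BLOCK
first_cluster = allocate(full, 4, 2)
```

A first-fit scan like this reuses gaps left by deleted nodes before the file grows, which is what keeps deletion cheap in step (3).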
(3) When a file node needs to be deleted, only the use state of the clusters corresponding to the file node in the digest areas of the data storage file and the directory structure file is set to unused; the actual information stored in the data area does not need to be deleted.
Further, this embodiment also checks whether the ratio of the total number of valid clusters of the data area, as stored in the file headers of the data storage file and the directory structure file, to the total number of all clusters of the data area is smaller than a ratio threshold (set to 1/3 in this embodiment). If so, the data area of each data storage file or directory structure file whose ratio is below the threshold is defragmented, and surplus clusters are deleted block by block. In this embodiment, defragmentation makes the valid clusters in the data area physically contiguous and updates the addresses of the corresponding file nodes stored in the directory structure file.
The total number of all clusters in the data area is the product of the number of blocks and the number of data-area clusters contained in each block.
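Deletion, the threshold test and compaction can be sketched together. The 1/3 threshold and the 120 data clusters per block come from the text; the list-based model of the digest area, the function names and the relocation map returned by `compact` are illustrative assumptions, and trimming of whole surplus blocks is left out.

```python
from fractions import Fraction
from typing import Dict, List, Tuple

CLUSTER_SIZE = 16
DATA_CLUSTERS_PER_BLOCK = 120
RATIO_THRESHOLD = Fraction(1, 3)   # value used in this embodiment
STATE_UNUSED = 0


def delete_node(states: List[int], start: int, n: int) -> int:
    """Mark n clusters unused without touching the data area; return how many were freed."""
    freed = 0
    for i in range(start, start + n):
        if states[i] != STATE_UNUSED:
            states[i] = STATE_UNUSED
            freed += 1
    return freed


def needs_defragment(valid_clusters: int, block_count: int) -> bool:
    """True when valid clusters / total data clusters falls below the threshold."""
    total = block_count * DATA_CLUSTERS_PER_BLOCK
    return total > 0 and Fraction(valid_clusters, total) < RATIO_THRESHOLD


def compact(states: List[int], data: bytes) -> Tuple[List[int], bytes, Dict[int, int]]:
    """Pack used clusters together; the map old index -> new index drives address updates."""
    relocation: Dict[int, int] = {}
    new_states: List[int] = []
    chunks: List[bytes] = []
    for old, state in enumerate(states):
        if state != STATE_UNUSED:
            relocation[old] = len(new_states)
            new_states.append(state)
            chunks.append(data[old * CLUSTER_SIZE:(old + 1) * CLUSTER_SIZE])
    return new_states, b"".join(chunks), relocation
```

With two blocks there are 240 data clusters, so `needs_defragment(79, 2)` is true while `needs_defragment(80, 2)` is false, since 80/240 equals the threshold exactly.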
(4) When a file node needs to be modified, there are two cases: 1. when the number of clusters occupied by the modified information is unchanged or reduced, the information at the original address is overwritten directly; 2. when the number of clusters occupied by the modified information increases, the old file node is deleted first and the modified file node is then added as a new node.
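The two modification cases reduce to a comparison of cluster counts. A minimal sketch, in which the function names and the returned labels are hypothetical:

```python
CLUSTER_SIZE = 16


def clusters_needed(n_bytes: int) -> int:
    """Number of whole clusters required to hold n_bytes."""
    return -(-n_bytes // CLUSTER_SIZE)


def plan_modification(old_len: int, new_len: int) -> str:
    """Choose between overwriting in place and delete-then-add."""
    if clusters_needed(new_len) <= clusters_needed(old_len):
        return "overwrite"           # case 1: same or fewer clusters
    return "delete_then_add"         # case 2: more clusters are required
```

Note that the decision depends on the cluster count rather than the byte length: growing a path from 30 to 32 bytes still fits in two clusters and is overwritten in place, whereas growing it to 33 bytes requires a third cluster.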
(5) When a file node is queried, the file storage path information corresponding to the file node is split by directory level, and the search then proceeds level by level, starting from the root directory node in the first directory structure file.
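The path-based query can be sketched on an in-memory tree. This is a simplification: child nodes are held in a dictionary keyed by name rather than resolved through child addresses in the directory structure file, the "/" separator is assumed, and the class and function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Union


@dataclass
class FileNode:
    name: str
    data_address: int   # address of the node's record in the data storage file


@dataclass
class DirNode:
    name: str
    children: Dict[str, Union["DirNode", FileNode]] = field(default_factory=dict)


def lookup(root: DirNode, full_path: str) -> Optional[FileNode]:
    """Split the path by directory level and descend from the root node."""
    node: Union[DirNode, FileNode, None] = root
    for part in (p for p in full_path.split("/") if p):
        if not isinstance(node, DirNode):
            return None                    # tried to descend through a file
        node = node.children.get(part)
        if node is None:
            return None                    # no such entry at this level
    return node if isinstance(node, FileNode) else None


# Example tree: /www/index.html
root = DirNode("/")
www = DirNode("www")
root.children["www"] = www
www.children["index.html"] = FileNode("index.html", 0x01000001)
found = lookup(root, "/www/index.html")
```

Each path component costs one step down the tree, so a lookup touches only the directories on the path instead of scanning a table of all files.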
(6) When it needs to be judged whether the data in the directory structure file and the data storage file has changed, the following steps are performed:
S101: calculating the file check codes of the directory structure file and the data storage file, and comparing each calculated file check code with the file check code stored in the file header; if they are identical, judging that the data is unchanged; otherwise, proceeding to S102;
S102: calculating the check code corresponding to the data area of each block from the data areas of the directory structure file and the data storage file, comparing each calculated check code with the check code of the corresponding block stored in the digest area, and comparing the data areas of the blocks whose check codes differ byte by byte to obtain the changed clusters.
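Steps S101 and S102 can be sketched as a two-level comparison between a trusted copy of a data area and the current copy. For brevity, the check codes of the trusted copy are recomputed here and stand in for the values stored in the file header and the digest area; the function names are hypothetical and CRC32 is used as in the first embodiment.

```python
import zlib
from typing import List

CLUSTER_SIZE = 16
DATA_CLUSTERS_PER_BLOCK = 120
BLOCK_DATA_BYTES = CLUSTER_SIZE * DATA_CLUSTERS_PER_BLOCK   # 1920 bytes


def block_crcs(data_area: bytes) -> List[int]:
    """CRC32 of the data area of each block."""
    return [zlib.crc32(data_area[i:i + BLOCK_DATA_BYTES])
            for i in range(0, len(data_area), BLOCK_DATA_BYTES)]


def changed_clusters(reference: bytes, current: bytes) -> List[int]:
    """Return the indices of clusters that differ between the two data areas."""
    # S101: a matching whole-file check code means no change.
    if zlib.crc32(reference) == zlib.crc32(current):
        return []
    # S102: compare per-block check codes, then inspect only the differing blocks.
    ref_crcs, cur_crcs = block_crcs(reference), block_crcs(current)
    changed: List[int] = []
    for b in range(max(len(ref_crcs), len(cur_crcs))):
        if b < len(ref_crcs) and b < len(cur_crcs) and ref_crcs[b] == cur_crcs[b]:
            continue
        for c in range(DATA_CLUSTERS_PER_BLOCK):
            start = b * BLOCK_DATA_BYTES + c * CLUSTER_SIZE
            if reference[start:start + CLUSTER_SIZE] != current[start:start + CLUSTER_SIZE]:
                changed.append(start // CLUSTER_SIZE)
    return changed


# Example: flip one byte in cluster 5 of the second block.
trusted = bytes(BLOCK_DATA_BYTES * 2)
tampered = bytearray(trusted)
tampered[BLOCK_DATA_BYTES + 5 * CLUSTER_SIZE + 3] = 0xFF
diff = changed_clusters(trusted, bytes(tampered))
```

Because only blocks with differing check codes are compared byte by byte, the cost of locating a change scales with the number of changed blocks rather than with the size of the file.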
Embodiment three:
The invention also provides a data storage terminal device, which comprises a memory, a processor and a computer program stored in the memory and capable of running on the processor, wherein the steps in the method of the second embodiment of the invention are realized when the processor executes the computer program.
Further, as an executable scheme, the data storage terminal device may be a computing device such as a desktop computer, a notebook computer, a palmtop computer or a cloud server. The data storage terminal device may include, but is not limited to, a processor and a memory. It will be appreciated by those skilled in the art that the above-described structure is merely an example of the data storage terminal device and does not constitute a limitation of it; the device may include more or fewer components than those described above, may combine some components, or may use different components. For example, the data storage terminal device may further include an input/output device, a network access device, a bus, etc., which is not limited in this embodiment of the present invention.
Further, as an implementation, the processor may be a central processing unit (Central Processing Unit, CPU), another general purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field programmable gate array (Field-Programmable Gate Array, FPGA) or another programmable logic device, a discrete gate or transistor logic device, discrete hardware components, etc. The general purpose processor may be a microprocessor or any conventional processor. The processor is the control center of the data storage terminal device and connects the various parts of the entire device using various interfaces and lines.
The memory may be used to store the computer program and/or modules, and the processor implements the various functions of the data storage terminal device by running or executing the computer program and/or modules stored in the memory and invoking data stored in the memory. The memory may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system and at least one application program required for a function; the data storage area may store data created during use of the terminal device, etc. In addition, the memory may include high-speed random access memory, and may also include non-volatile memory, such as a hard disk, internal memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash card, at least one magnetic disk storage device, a flash memory device, or another solid-state storage device.
The present invention also provides a computer readable storage medium storing a computer program which when executed by a processor implements the steps of the above-described method of an embodiment of the present invention.
The modules/units integrated in the data storage terminal device may be stored in a computer readable storage medium if they are implemented in the form of software functional units and sold or used as a stand-alone product. Based on this understanding, all or part of the flow of the method of the above embodiment may also be implemented by a computer program instructing related hardware, where the computer program may be stored in a computer readable storage medium and, when executed by a processor, implements the steps of each of the method embodiments described above. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), a software distribution medium, and so forth.
While the invention has been particularly shown and described with reference to a preferred embodiment, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. A data storage system, comprising: a file list file, a directory structure file, and a data storage file;
the file list file is used for storing the file information and the directory root node address of the directory structure file and the data storage file, and the file information comprises a file code, a file type and a file check code;
the directory structure file and the data storage file each comprise a file header, a data area and a summary area, wherein:
the file header of the directory structure file is used for storing file information and structure information of the directory structure file; the file header of the data storage file is used for storing file information and structure information of the data storage file; the structure information of the directory structure file and the data storage file comprises the total number of effective clusters of the data area;
the data area of the directory structure file is used for storing address information of each file node, and the address information consists of directory address information and file address information, wherein: the directory address information comprises the length of the directory name, the number of child nodes contained in the directory, the address of the parent directory node of the directory, the address of each child node contained in the directory, and the directory name; the file address information comprises the length of the file name, the address of the parent node of the file node, the file name, and the address in the data storage file of the data corresponding to the file node;
the data area of the data storage file is used for storing data information of each file node, and the data information comprises a check code of the file and file storage path information;
the summary areas of the directory structure file and the data storage file are used for storing the use state of each cluster in the corresponding data area, the number of used effective clusters in the data area of each block and the check code corresponding to the data area of each block.
2. The data storage system of claim 1, wherein: the use state of each cluster stored in the summary area of the directory structure file is one of four types: unused, directory information, file name, or directory name.
3. The data storage system of claim 1, wherein: the use state of each cluster stored in the summary area of the data storage file is one of three types: unused, check code of the file, or file storage path information.
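The layout described in claims 1 to 3 can be pictured with a minimal Python sketch. All class and field names, the numeric values of the use states, and the field types are hypothetical and chosen only for illustration; the claims do not fix an on-disk encoding.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import List


class DirClusterState(IntEnum):
    """Use states of clusters in the directory structure file (claim 2)."""
    UNUSED = 0
    DIRECTORY_INFO = 1
    FILE_NAME = 2
    DIRECTORY_NAME = 3


class DataClusterState(IntEnum):
    """Use states of clusters in the data storage file (claim 3)."""
    UNUSED = 0
    FILE_CHECK_CODE = 1
    STORAGE_PATH = 2


@dataclass
class FileHeader:
    """File information plus structure information (types are assumed)."""
    file_code: int
    file_type: int
    file_check_code: bytes
    effective_cluster_total: int = 0  # structure information


@dataclass
class BlockSummary:
    """Summary-area record for one block of the data area."""
    cluster_states: List[int]
    used_effective_clusters: int = 0
    block_check_code: bytes = b""


@dataclass
class ContainerFile:
    """Shared shape of the directory structure file and the data storage file."""
    header: FileHeader
    data_area: bytearray = field(default_factory=bytearray)
    summary: List[BlockSummary] = field(default_factory=list)
```

Both files share the same three-part shape here and differ only in which set of use states their summary area records.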
4. A data storage method, characterized in that: it is based on the data storage system according to any one of claims 1 to 3 and comprises: when a file node needs to be added, acquiring the data information of the newly added file node from the file corresponding to that node, and storing the data information into the data area of the data storage file; then, according to the address at which that data information is stored in the data storage file and the file storage path information, acquiring the directory address information and the file address information corresponding to the newly added file node, adding the file address information to the data area of the directory structure file, and updating or adding the directory address information.
5. The data storage method of claim 4, wherein: when information is added to the data area of the data storage file or the directory structure file, the corresponding summary area is searched for n consecutive clusters whose use state is unused, n being the number of clusters required by the newly added information; if such clusters exist, the newly added information is stored in the n consecutive clusters found in the data area; otherwise, the space of one block is added to the data storage file or the directory structure file, and the newly added information is stored in n consecutive clusters in the data area of the newly added space; after the newly added information is stored in the data area, the use states of the corresponding n clusters are updated in the summary area.
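The allocation rule of claim 5 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: the summary area is modelled as a flat list of integer use states, `UNUSED`, `find_free_run`, `allocate`, and the default of 8 clusters per block are all hypothetical, and a free run is allowed to cross a block boundary because the claim does not say otherwise.

```python
UNUSED = 0  # assumed numeric value of the "unused" use state


def find_free_run(states, n):
    """Return the index of the first run of n consecutive unused clusters, or -1."""
    run_start, run_len = 0, 0
    for i, state in enumerate(states):
        if state == UNUSED:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len == n:
                return run_start
        else:
            run_len = 0
    return -1


def allocate(states, n, new_state, clusters_per_block=8):
    """Reserve n consecutive clusters and return the index of the first one.

    If no free run exists, one block of unused clusters is appended and the
    new information is placed at the start of that new block.
    """
    if not 0 < n <= clusters_per_block:
        raise ValueError("n must fit inside one block")
    start = find_free_run(states, n)
    if start < 0:
        start = len(states)
        states.extend([UNUSED] * clusters_per_block)
    # Update the use states in the summary area after storing the data.
    for i in range(start, start + n):
        states[i] = new_state
    return start
```

For example, with states `[1, 0, 0, 1, 0, 0, 0, 2]` a request for three clusters is served from index 4, and a second request for three clusters appends a new block and is served from index 8.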
6. A data storage method, characterized in that: it is based on the data storage system according to any one of claims 1 to 3 and comprises: when a file node needs to be deleted, setting to unused the use state of the clusters corresponding to the file node in the summary areas of the data storage file and the directory structure file, without deleting the actual information stored in the data area.
7. The data storage method of claim 6, wherein: when a file node needs to be deleted, the method further comprises: checking whether the ratio of the total number of effective clusters of the data area to the total number of all clusters of the data area, as stored in the file headers of the data storage file and the directory structure file, is smaller than a ratio threshold; if so, defragmenting the data area of each file whose ratio is below the threshold and deleting redundant clusters block by block.
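Claims 6 and 7 describe a logical delete followed by a threshold check. A minimal sketch, assuming the summary area is a flat list of integer use states; the function names and the default threshold of 0.5 are hypothetical, since the claims do not give a concrete value.

```python
UNUSED = 0  # assumed numeric value of the "unused" use state


def delete_node(states, start, n):
    """Logical delete: mark n clusters as unused; the data area is not touched."""
    for i in range(start, start + n):
        states[i] = UNUSED


def needs_defrag(effective_total, cluster_total, ratio_threshold=0.5):
    """Return True when the share of effective clusters is below the threshold."""
    if cluster_total == 0:
        return False
    return effective_total / cluster_total < ratio_threshold
```

Because only the summary area changes, deletion costs a few state updates, and the stale bytes remain in the data area until defragmentation removes them.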
8. A data storage method, characterized in that: it is based on the data storage system according to any one of claims 1 to 3 and, when it needs to be judged whether the directory structure file and the data storage file have been tampered with, comprises the following steps:
S101: calculating the file check codes of the directory structure file and the data storage file from those files; comparing the calculated file check code of the directory structure file with the file check code stored in the file header of the directory structure file, and comparing the calculated file check code of the data storage file with the file check code stored in the file header of the data storage file; when the compared codes for the directory structure file are the same, judging that the directory structure file has not been tampered with; when the compared codes for the data storage file are the same, judging that the data storage file has not been tampered with; when the directory structure file or the data storage file is judged to have been tampered with, proceeding to S102;
S102: calculating the check code corresponding to the data area of each block from the data area of the directory structure file or the data storage file, comparing each calculated check code with the check code of the corresponding block stored in the summary area, and comparing, byte by byte, the data areas of the blocks whose check codes differ to obtain the changed clusters, thereby identifying the tampered files.
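The two-stage check of S101 and S102 can be sketched as below. Several points are assumptions made for illustration only: SHA-256 stands in for the unspecified check code, the file check code is computed over the data area alone, the byte-by-byte comparison is made against a trusted reference copy of the same length, and all function and parameter names are hypothetical.

```python
import hashlib


def check_code(data):
    """Assumed check code: a SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).digest()


def split_blocks(data_area, block_size):
    """Cut the data area into consecutive blocks of block_size bytes."""
    return [data_area[i:i + block_size]
            for i in range(0, len(data_area), block_size)]


def find_changed_clusters(data_area, stored_file_code, stored_block_codes,
                          reference, block_size, cluster_size):
    """Return the indices of clusters that differ from the reference copy."""
    # S101: a matching whole-file check code means nothing was tampered with.
    if check_code(data_area) == stored_file_code:
        return []
    changed = []
    # S102: only blocks whose check code differs are compared byte by byte.
    for b, block in enumerate(split_blocks(data_area, block_size)):
        if check_code(block) == stored_block_codes[b]:
            continue
        base = b * block_size
        for offset, byte in enumerate(block):
            if byte != reference[base + offset]:
                cluster = (base + offset) // cluster_size
                if not changed or changed[-1] != cluster:
                    changed.append(cluster)
    return changed
```

The per-block check codes keep the expensive byte comparison confined to the blocks that actually changed, and the changed cluster indices can then be mapped back to file nodes through the directory structure file.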
9. A data storage terminal device, characterized in that: it comprises a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, carries out the steps of the method according to any one of claims 4 to 8.
10. A computer-readable storage medium storing a computer program, characterized in that: the computer program implementing the steps of the method according to any of claims 4 to 8 when executed by a processor.
CN202111091089.3A 2021-09-17 2021-09-17 Data storage method, system, terminal equipment and storage medium Active CN113806803B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111091089.3A CN113806803B (en) 2021-09-17 2021-09-17 Data storage method, system, terminal equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111091089.3A CN113806803B (en) 2021-09-17 2021-09-17 Data storage method, system, terminal equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113806803A CN113806803A (en) 2021-12-17
CN113806803B true CN113806803B (en) 2023-06-02

Family

ID=78895681

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111091089.3A Active CN113806803B (en) 2021-09-17 2021-09-17 Data storage method, system, terminal equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113806803B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116991863B (en) * 2023-09-27 2023-12-01 深圳市前海数据服务有限公司 Data auxiliary analysis management system and method based on big data

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107239526A (en) * 2017-05-27 2017-10-10 河南思维轨道交通技术研究院有限公司 File system implementation method, scrap cleaning method, operating position localization method
CN108052541A (en) * 2017-11-22 2018-05-18 中国科学院上海微系统与信息技术研究所 The realization of file system based on multi-level page-table bibliographic structure, access method, terminal
CN112463753A (en) * 2020-11-06 2021-03-09 苏州浪潮智能科技有限公司 Block chain data storage method, system, equipment and readable storage medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101319465B1 (en) * 2011-09-27 2013-10-17 주식회사 미니게이트 File providing system for n-screen service
CN103092849A (en) * 2011-10-28 2013-05-08 浙江大华技术股份有限公司 File system cluster management method
CN108021717B (en) * 2017-12-29 2020-12-01 成都三零嘉微电子有限公司 Method for implementing lightweight embedded file system
CN111159114A (en) * 2019-12-30 2020-05-15 中国科学院寒区旱区环境与工程研究所 File storage method and device and server
CN111782625A (en) * 2020-06-30 2020-10-16 安徽芯智科技有限公司 Core intelligence technology embedded remote file system software

Also Published As

Publication number Publication date
CN113806803A (en) 2021-12-17

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant