CN114116611A - File scanning method and related device - Google Patents

Info

Publication number
CN114116611A
Authority
CN
China
Prior art keywords
directory
information
subfile
file
dirty
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010890951.6A
Other languages
Chinese (zh)
Inventor
刘志远
马明刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN202010890951.6A
Publication of CN114116611A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An embodiment of the invention discloses a file scanning method and a related device. The method comprises: receiving a scanning request for a target storage device and acquiring first file information, where the first file information includes first directory information of a scanned file list of the target storage device, and the first directory information includes a first modification timestamp for each directory in the scanned file list; acquiring second file information, where the second file information includes second directory information of a to-be-scanned file list of the target storage device, and the second directory information includes a second modification timestamp for each directory in the to-be-scanned file list; and comparing the first directory information with the second directory information to generate a dirty directory set. A dirty directory is a directory whose first modification timestamp is inconsistent with its second modification timestamp, or a directory that appears in the second directory information but not in the first directory information. The embodiment of the invention can improve file scanning efficiency when a storage device is connected to an electronic device and the electronic device scans its files.

Description

File scanning method and related device
Technical Field
The present invention relates to the field of file scanning technologies, and in particular, to a file scanning method and a related apparatus.
Background
With the development of electronic devices, the application scenarios of electronic devices such as smart screens, personal computers, and tablet computers are increasingly diversified. After a storage device (a mobile hard disk, a USB flash drive, etc.) is connected to an electronic device, the electronic device scans the files in the storage device to obtain the file information of the storage device. As the storage capacity of storage devices keeps growing, the number of files that can be stored increases, and so does the number of files that must be traversed during file scanning.
For example, a user may want to view media information in a storage device. When the storage device is connected to the electronic device, the electronic device scans the media files of the storage device. The scanning process traverses all files to be scanned in the storage device; each media file that is found is parsed, and the resulting file scanning result (for example, a thumbnail of the media file) is stored in the media database. The electronic device can then obtain the media file information of the storage device by reading the data in the media database. When the storage device is removed from the electronic device, the electronic device clears the scanning result of the storage device temporarily stored in the media database. Therefore, when the storage device is connected to the electronic device again, the electronic device performs a full-disk scan of the storage device again.
However, in scenarios where a storage device is frequently plugged into and unplugged from an electronic device such as a smart screen, if the storage device holds a large number of files, each full-disk scan takes a long time, so the user cannot view the file content promptly, which degrades the user experience.
Therefore, how to improve the file scanning efficiency when the storage device is connected to the electronic device and the electronic device performs file scanning is an urgent problem to be solved.
Disclosure of Invention
The technical problem to be solved by the embodiments of the present invention is to provide a file scanning method and a related apparatus that address the long scanning time and low efficiency of file scanning when the same storage device is frequently connected to the same electronic device.
In a first aspect, an embodiment of the present invention provides a file scanning method, which may include:
receiving a scanning request for a target storage device, and acquiring first file information, where the first file information comprises first directory information of a scanned file list stored by the target storage device, and the first directory information comprises a first modification timestamp for each directory in the scanned file list; acquiring second file information, where the second file information comprises second directory information of a to-be-scanned file list stored in the target storage device, and the second directory information comprises a second modification timestamp for each directory in the to-be-scanned file list; and comparing the first directory information with the second directory information to generate a dirty directory set. A dirty directory in the dirty directory set is either a directory that has the same file name in the first directory information and the second directory information but whose first modification timestamp and second modification timestamp are inconsistent, or a directory that appears in the second directory information but not in the first directory information. In the embodiment of the invention, when the same storage device is connected to the electronic device multiple times, the electronic device compares the modification timestamp recorded for each directory at the last scan with that directory's current modification timestamp, and determines any directory whose two timestamps are inconsistent to be a dirty directory. It then traverses the subfiles and subdirectories under each dirty directory to determine which files or directories have actually changed. This avoids a full-disk traversal of the storage device, reduces the number of traversed files, shortens the traversal time, and improves file scanning efficiency.
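The comparison described above can be illustrated with a minimal Python sketch. It is not taken from the application: the function name `build_dirty_set` and the representation of directory information as a mapping from directory path to modification timestamp are assumptions made only for illustration.

```python
from typing import Dict, Set


def build_dirty_set(first: Dict[str, int], second: Dict[str, int]) -> Set[str]:
    """Return the dirty directory set.

    first:  directory path -> first modification timestamp (last scan)
    second: directory path -> second modification timestamp (current state)
    Both mappings are a hypothetical representation of the directory information.
    """
    dirty = set()
    for path, second_ts in second.items():
        if path not in first:
            # Appears only in the second directory information: a new directory.
            dirty.add(path)
        elif first[path] != second_ts:
            # Same name, inconsistent timestamps: a changed directory.
            dirty.add(path)
    return dirty


scanned = {"/": 100, "/music": 200, "/photos": 300}
current = {"/": 100, "/music": 250, "/photos": 300, "/video": 400}
print(sorted(build_dirty_set(scanned, current)))  # ['/music', '/video']
```

Only `/music` (timestamp changed) and `/video` (newly added) would then need to be traversed; `/` and `/photos` are skipped.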
In a possible implementation, acquiring the first file information includes searching, according to a Universally Unique Identifier (UUID) of the storage device, for the first file information that matches the UUID. In the embodiment of the invention, each storage device has a UUID, and different storage devices have different UUIDs, so when a storage device is connected to the electronic device again, the electronic device looks up the latest scanning result of that storage device in the database according to its UUID. In this way, the electronic device can quickly find the latest scanning result of the storage device on reconnection without additionally marking the scanning result.
In one possible implementation, generating the dirty directory set includes: in a case where no directory exists in the first directory information, adding the root directory of the to-be-scanned file list stored by the storage device to the dirty directory set. In the embodiment of the present invention, when the storage device is connected to the electronic device for the first time, the database of the electronic device has no record of a scanning result for the storage device; in this case, initially only the root directory of the files stored in the storage device is recorded as a dirty directory. The root directory is then traversed, and any new directory or new dirty directory found during the traversal is added to the dirty directory set. In this way, the first file scan of the storage device can be completed quickly.
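The UUID lookup and the first-connection case can be sketched together as follows. This is a hedged illustration only: the in-memory `db` dictionary keyed by UUID, the function name `initial_dirty_set`, the sample UUID strings, and the `"/"` root path are all assumptions, not details from the application.

```python
from typing import Dict, Set

# Hypothetical database layout: device UUID -> {directory path -> timestamp}.
ScanDb = Dict[str, Dict[str, int]]


def initial_dirty_set(db: ScanDb, device_uuid: str,
                      current: Dict[str, int], root: str = "/") -> Set[str]:
    """Seed the dirty directory set for a device identified by its UUID."""
    first = db.get(device_uuid)
    if not first:
        # No directory recorded for this UUID (first connection):
        # only the root directory is dirty at the start.
        return {root}
    # Otherwise compare recorded and current timestamps per directory.
    # first.get(path) is None for new directories, so they also count as dirty.
    return {path for path, ts in current.items() if first.get(path) != ts}


db = {"1234-ABCD": {"/": 100, "/music": 200}}
now = {"/": 100, "/music": 201}
print(initial_dirty_set(db, "FFFF-0000", now))  # {'/'}
print(initial_dirty_set(db, "1234-ABCD", now))  # {'/music'}
```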
In a possible implementation, after the first directory information is compared with the second directory information, a directory to be deleted is deleted from the first directory information, and the scanning result corresponding to the directory to be deleted is also deleted; the directory to be deleted is a directory that appears in the first directory information but not in the second directory information. In the embodiment of the invention, if the user deletes some directories stored in the storage device after it is unplugged and then connects it to the electronic device again, the deleted directories have no current modification timestamp, but their last modification timestamps still exist in the database of the electronic device. When the electronic device finds that a directory has a last modification timestamp but no current modification timestamp, it determines that the directory has been deleted and deletes all information related to the directory from its database. In this way, the electronic device can quickly find deleted directories and update the file scanning result of the storage device in the database.
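A minimal sketch of this cleanup step, assuming (hypothetically) that the scanning results are stored in a dictionary keyed by directory path; the function name `prune_deleted` is likewise an illustrative choice.

```python
from typing import Dict, List


def prune_deleted(first: Dict[str, int],
                  scan_results: Dict[str, List[str]],
                  second: Dict[str, int]) -> List[str]:
    """Remove directories that exist only in the first directory information.

    first / second: directory path -> modification timestamp.
    scan_results:   directory path -> stored results for that directory
                    (an assumed layout, used only for this sketch).
    """
    to_delete = [path for path in first if path not in second]
    for path in to_delete:
        del first[path]               # drop the stale directory record
        scan_results.pop(path, None)  # drop its scanning result, if any
    return to_delete


first_dirs = {"/": 100, "/old": 150, "/music": 200}
results = {"/old": ["a.mp3"], "/music": ["b.mp3"]}
second_dirs = {"/": 120, "/music": 200}
removed = prune_deleted(first_dirs, results, second_dirs)
print(removed)  # ['/old']
```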
In a possible implementation, when the dirty directory set is an empty set, the file scanning of the to-be-scanned file list is ended. In the embodiment of the invention, if the user does not change any file or directory stored in the storage device after it is unplugged, the modification timestamps of those files and directories do not change. When the storage device is connected to the electronic device again, the electronic device compares, for each directory with the same file name, the last modification timestamp with the current modification timestamp; if the two timestamps are consistent for all directories, it determines that no directory has been modified and ends the file scanning of the storage device. In this way, repeated scanning of unchanged files is avoided, which reduces the time the user waits for file scanning.
In one possible implementation, when the dirty directory set is a non-empty set, file scanning is performed on the dirty directories in the dirty directory set. In the embodiment of the invention, if the user changes a file or directory stored in the storage device after it is unplugged, the modification timestamp of the changed file or directory changes. Therefore, for some directories the last modification timestamp is inconsistent with the current modification timestamp, and the electronic device scans only those directories. This avoids repeated scanning of unmodified directories and reduces the number of files scanned.
In one possible implementation, performing file scanning on the dirty directories in the dirty directory set includes: comparing the first modification timestamp and the second modification timestamp of each dirty directory in the dirty directory set; and deleting, from the dirty directory set, any dirty directory whose first modification timestamp is consistent with its second modification timestamp. In the embodiment of the invention, because multiple dirty directories can be scanned in parallel, before each dirty directory is scanned it can be checked again whether its last modification timestamp is consistent with its current modification timestamp; if so, the dirty directory is deleted from the dirty directory set, which prevents dirty directories from being scanned repeatedly.
In one possible implementation, attribute information of the dirty directory is recorded; the attribute information includes the second modification timestamp of the dirty directory. In the embodiment of the invention, before file scanning is performed on a changed directory, its current modification timestamp is recorded so that the file scanning result in the database can be updated afterwards.
In one possible implementation, all direct subfiles under the dirty directory are traversed; the direct subfiles include the subfiles and subdirectories directly under the dirty directory. In the embodiment of the present invention, when a dirty directory is scanned, only its direct subfiles are traversed, and the files and directories beneath those direct subfiles do not need to be traversed. This is because, when the initial dirty directory set is determined, every directory (including every subdirectory) whose timestamp has changed is determined to be a dirty directory, and the direct subfiles of all dirty directories are scanned. Scanning only the direct subfiles under each dirty directory therefore covers all directories or files with changed timestamps and all newly added directories in the storage device. In this way, traversing only the direct subfiles under each dirty directory reduces the number of files traversed during file scanning and the traversal time, thereby improving file scanning efficiency.
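A non-recursive listing of direct subfiles can be sketched with Python's standard `os.scandir`. The helper name `direct_subfiles` and the returned `(is_dir, mtime)` tuple are illustrative assumptions; the application does not specify which file system interface is used.

```python
import os
import tempfile
from typing import Dict, Tuple


def direct_subfiles(dirty_dir: str) -> Dict[str, Tuple[bool, int]]:
    """Map each direct child name to (is_directory, modification timestamp).

    Only one level is listed; nothing below the children is visited.
    """
    entries = {}
    with os.scandir(dirty_dir) as it:
        for entry in it:
            st = entry.stat(follow_symlinks=False)
            entries[entry.name] = (entry.is_dir(follow_symlinks=False),
                                   int(st.st_mtime))
    return entries


# Small demonstration on a temporary directory tree.
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "album"))
    with open(os.path.join(root, "album", "deep.jpg"), "w") as f:
        f.write("x")
    with open(os.path.join(root, "song.mp3"), "w") as f:
        f.write("x")
    listing = direct_subfiles(root)

print(sorted(listing))  # ['album', 'song.mp3']
```

The nested `deep.jpg` is not visited here; under the scheme described above it would be reached only if `album` is itself in the dirty directory set.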
In one possible implementation, first subfile information is acquired, where the first subfile information comprises a first sub-modification timestamp of each direct subfile of the dirty directory in the scanned file list; second subfile information is acquired, where the second subfile information comprises a second sub-modification timestamp of each direct subfile of the dirty directory in the to-be-scanned file list; and the first subfile information and the second subfile information of direct subfiles with the same file name are compared to determine the changed direct subfiles. In the embodiment of the invention, if the user changes a direct subfile of a dirty directory stored in the storage device after it is unplugged, the modification timestamp of that direct subfile changes. The last modification timestamp of the direct subfile is compared with its current modification timestamp; if the two are inconsistent, the direct subfile is determined to have been modified. In this way, whether the direct subfiles in a dirty directory have changed can be determined quickly.
In one possible implementation, determining the changed direct subfile includes: determining that the direct subfile is a modified direct subfile, where a modified direct subfile is a direct subfile that has the same file name in the first subfile information and the second subfile information but whose first sub-modification timestamp and second sub-modification timestamp are inconsistent; determining that the direct subfile is a deleted direct subfile, where a deleted direct subfile is a direct subfile that appears in the first subfile information but not in the second subfile information. In the embodiment of the invention, if a direct subfile with the same file name has both a last modification timestamp and a current modification timestamp and the two are inconsistent, the user modified the direct subfile after the storage device was disconnected, and it is determined to be a modified direct subfile. If a direct subfile has only a last modification timestamp and no current modification timestamp, the user deleted the direct subfile after the storage device was disconnected, and it is determined to be a deleted direct subfile.
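The two categories above can be sketched as a comparison of two name-to-timestamp mappings. The function name `classify_subfiles` and the dictionary representation of the subfile information are assumptions for illustration.

```python
from typing import Dict, List, Tuple


def classify_subfiles(first: Dict[str, int],
                      second: Dict[str, int]) -> Tuple[List[str], List[str]]:
    """Classify the direct subfiles of one dirty directory.

    first:  name -> first sub-modification timestamp (last scan)
    second: name -> second sub-modification timestamp (current state)
    Returns (modified, deleted), each sorted by name.
    """
    # Same name in both, but inconsistent timestamps: modified.
    modified = [n for n in second if n in first and first[n] != second[n]]
    # Present in the first information only: deleted.
    deleted = [n for n in first if n not in second]
    return sorted(modified), sorted(deleted)


last_scan = {"a.mp3": 10, "b.jpg": 20, "c.txt": 30}
now_state = {"a.mp3": 10, "b.jpg": 25}
print(classify_subfiles(last_scan, now_state))  # (['b.jpg'], ['c.txt'])
```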
In one possible implementation, the information of the deleted direct subfile is deleted from the scanning result of the scanned file list. In the embodiment of the invention, when the electronic device finds that a direct subfile has a last modification timestamp but no current modification timestamp, it determines that the direct subfile has been deleted and deletes all information related to that direct subfile from its database. In this way, the electronic device can quickly find deleted direct subfiles and update the file scanning result of the storage device in the database.
In one possible implementation, it is determined whether the modified direct subfile is a directory. If not, the modified direct subfile is determined to be a modified subfile; the file attributes of the modified subfile are acquired, file parsing is performed on the modified subfile, and the scanning result of the modified subfile is updated in the scanning result of the scanned file list. If so, the modified direct subfile is determined to be a modified subdirectory; the directory attributes of the modified subdirectory are acquired, the first sub-modification timestamp of the modified subdirectory is set to a negative number, and the scanning result of the modified subdirectory is inserted into the scanning result of the scanned file list. In the embodiment of the invention, when a changed subfile is found under a dirty directory, it is parsed again and its file scanning result is updated in the database of the electronic device. When a modified subdirectory is found, its last modification timestamp is set to a negative number: a modification timestamp is normally greater than 0, so a negative value marks the modified subdirectory as dirty. The negative timestamp is replaced with the current modification timestamp only after the information related to the modified subdirectory has been updated successfully. Because a negative timestamp differs from any normal timestamp, this ensures that the modified subdirectory is always treated as a dirty directory until its update is complete.
In one possible implementation, inserting the scanning result of the modified subdirectory into the scanning result of the scanned file list further includes: determining whether the scanning result of the modified subdirectory was successfully inserted into the scanning result of the scanned file list; and, if the insertion was successful, adding the modified subdirectory to the dirty directory set. In the embodiment of the invention, after the modification timestamp of the modified subdirectory is set to a negative number, the modified subdirectory is inserted into the scanning result; if the insertion is successful, the modified subdirectory is a new directory and is added to the dirty directory set.
In one possible implementation, the method further includes: determining that all of the direct subfiles of the dirty directory have been traversed; and updating the scanning result of the current dirty directory, where the scanning result of the current dirty directory includes the first modification timestamp of the dirty directory. Optionally, when a directory with a negative modification timestamp is found, the negative value is updated to the current modification timestamp. In the embodiment of the invention, after all the direct subfiles under a dirty directory have been traversed, the first modification timestamp of the dirty directory is changed to the current modification timestamp, and the scanning result of the dirty directory in the database of the electronic device is updated.
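The life cycle of the negative timestamp marker can be sketched as follows. The sketch assumes that an insertion fails when a record for the path already exists; the constant `DIRTY_MARK`, the helper names, and the dictionary-based database are hypothetical and not taken from the application.

```python
from typing import Dict, Set

DIRTY_MARK = -1  # assumed marker; real modification timestamps are taken to be > 0


def insert_subdirectory(db: Dict[str, int], dirty: Set[str], path: str) -> bool:
    """Insert a subdirectory with a negative timestamp; enqueue it on success."""
    if path in db:
        return False       # assumption: insertion fails if already recorded
    db[path] = DIRTY_MARK  # negative value marks the directory as dirty
    dirty.add(path)
    return True


def finish_directory(db: Dict[str, int], dirty: Set[str],
                     path: str, current_ts: int) -> None:
    """After all direct subfiles are traversed, store the current timestamp."""
    db[path] = current_ts
    dirty.discard(path)


def is_dirty(db: Dict[str, int], path: str, current_ts: int) -> bool:
    """A negative marker never equals a normal timestamp, so it stays dirty."""
    return db.get(path) != current_ts


db = {"/": 100}
dirty = {"/"}
inserted = insert_subdirectory(db, dirty, "/video")
dirty_before = is_dirty(db, "/video", 400)
finish_directory(db, dirty, "/video", 400)
dirty_after = is_dirty(db, "/video", 400)
print(inserted, dirty_before, dirty_after)  # True True False
```

If a scan is interrupted between `insert_subdirectory` and `finish_directory`, the stored value remains negative, so the directory is still detected as dirty on the next scan.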
In one possible implementation, the first modification timestamps of the directories in all scanning results within the time period from T3 - ΔT to T3 are set to negative numbers, where T3 is the time at which the storage device unload message is received and ΔT is a preset time period. Optionally, ΔT is an empirical time difference obtained for each system. Note that ΔT is generally greater than or equal to (T3 - T1), where T1 is the time at which the storage device is disconnected from the electronic device. In the embodiment of the present invention, because a negative timestamp can never be consistent with a normal timestamp, setting these first modification timestamps to negative numbers ensures that, when the electronic device scans the storage device again, the affected directories are determined to be dirty directories when the dirty directory set is formed, and are therefore traversed and parsed again. This addresses the problem that calls to the file system interface may return inconsistent results when the storage device is hot-plugged or a scan is abnormally interrupted, and improves the stability of file scanning and its usability in real scenarios.
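A rough sketch of this invalidation step is shown below. It rests on two assumptions that the text does not spell out: that each scanning result carries the time at which it was written (the hypothetical `scanned_at` field), and that the affected results are those written between T3 - ΔT and T3.

```python
from typing import Dict, List


def invalidate_recent(records: Dict[str, Dict[str, float]],
                      t3: float, delta_t: float) -> List[str]:
    """Mark directories whose results were written in [t3 - delta_t, t3] as dirty.

    records: directory path -> {"mtime": ..., "scanned_at": ...}
             (an assumed record layout, used only for this sketch).
    """
    marked = []
    for path, rec in records.items():
        if t3 - delta_t <= rec["scanned_at"] <= t3:
            rec["mtime"] = -1  # negative: never equal to a real timestamp
            marked.append(path)
    return marked


records = {
    "/": {"mtime": 100, "scanned_at": 10.0},
    "/music": {"mtime": 250, "scanned_at": 58.5},
}
marked = invalidate_recent(records, t3=60.0, delta_t=5.0)
print(marked)  # ['/music']
```

Results written shortly before the unload message are the ones most likely to reflect a half-removed device, which is why they are rescanned on the next connection.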
In a second aspect, the present application provides a file scanning apparatus, which may include:
a receiving module, configured to receive a scanning request for a target storage device;
a first obtaining module, configured to acquire first file information, where the first file information comprises first directory information of a scanned file list stored by the target storage device, and the first directory information comprises a first modification timestamp for each directory in the scanned file list;
a second obtaining module, configured to acquire second file information, where the second file information comprises second directory information of a to-be-scanned file list stored in the target storage device, and the second directory information comprises a second modification timestamp for each directory in the to-be-scanned file list;
a processing module, configured to compare the first directory information with the second directory information to generate a dirty directory set, where a dirty directory in the dirty directory set is either a directory that has the same file name in the first directory information and the second directory information but whose first modification timestamp and second modification timestamp are inconsistent, or a directory that appears in the second directory information but not in the first directory information.
In the embodiment of the invention, in the file scanning apparatus, the receiving module first receives the file scanning request, the first obtaining module acquires the first file information, and the second obtaining module acquires the second file information. The processing module compares the first directory information in the first file information with the second directory information in the second file information and then traverses the modified directories. This avoids a full-disk traversal of the storage device, reduces the number of traversed files, shortens the traversal time, and improves file scanning efficiency.
In a possible implementation, the first obtaining module is specifically configured to: search, according to a Universally Unique Identifier (UUID) of the storage device, for the first file information that matches the UUID.
In a possible implementation, the processing module is specifically configured to: add the root directory of the to-be-scanned file list stored by the storage device to the dirty directory set in a case where no directory exists in the first directory information.
In one possible implementation, the apparatus further includes a deleting module, configured to: after the first directory information is compared with the second directory information, delete a directory to be deleted from the first directory information and delete the scanning result corresponding to the directory to be deleted; the directory to be deleted is a directory that appears in the first directory information but not in the second directory information.
In a possible implementation, the processing module is specifically configured to: end the file scanning of the to-be-scanned file list when the dirty directory set is an empty set.
In a possible implementation, the processing module is specifically configured to: perform file scanning on the dirty directories in the dirty directory set when the dirty directory set is a non-empty set.
In a possible implementation, the processing module is configured to compare the first modification timestamp and the second modification timestamp of each dirty directory in the dirty directory set, and the deleting module is configured to delete, from the dirty directory set, any dirty directory whose first modification timestamp is consistent with its second modification timestamp.
In one possible implementation, the apparatus further includes: the recording module is used for recording the attribute information of the dirty directory; the attribute information includes a second modified timestamp for the dirty directory.
In a possible implementation manner, the processing module is specifically configured to: traversing all direct subfiles under the dirty directory; the direct subfiles include subfiles and subdirectories under the dirty directory.
In a possible implementation, the first obtaining module is configured to acquire first subfile information, where the first subfile information comprises a first sub-modification timestamp of each direct subfile of the dirty directory in the scanned file list; the second obtaining module is configured to acquire second subfile information, where the second subfile information comprises a second sub-modification timestamp of each direct subfile of the dirty directory in the to-be-scanned file list; and the processing module is configured to compare the first subfile information and the second subfile information of direct subfiles with the same file name to determine the changed direct subfiles.
In a possible implementation manner, the processing module is specifically configured to: determining that the direct subfile is a modified direct subfile; the modified direct subfile is a direct subfile with the same file name in the first subfile information and the second subfile information and inconsistent first sub-modification timestamp and the second sub-modification timestamp; determining that the direct subfile is a deleted direct subfile; the deleted immediate subfile is an immediate subfile that does not appear in the second subfile information and that appears in the first subfile information.
In a possible implementation manner, the deleting module is specifically configured to: deleting the information of the deleted direct subfile in the scanning result of the scanned file list.
In a possible implementation manner, the processing module is specifically configured to: judging whether the modified direct subfile is a directory or not; if not, determining that the modified direct subfile is a modified subfile, acquiring the file attribute of the modified subfile, performing file analysis on the modified subfile, and updating the scanning result of the modified subfile in the scanning result of the scanned file list; if yes, determining that the modified direct sub-file is a modified sub-directory, acquiring the directory attribute of the modified sub-directory, setting the first sub-modification time stamp of the modified sub-directory as a negative number, and inserting the scanning result of the modified sub-directory into the scanning result of the scanned file list.
In a possible implementation manner, the processing module is specifically configured to: judging whether the scanning result of the modified subdirectory is successfully inserted into the scanning result of the scanned file list or not; in the event that the insertion is successful, adding the modified subdirectory to the set of dirty directories.
In a possible implementation, the processing module is specifically configured to: determine that all of the direct subfiles of the dirty directory have been traversed; and update the scanning result of the current dirty directory, where the scanning result of the current dirty directory includes the first modification timestamp of the dirty directory.
In a possible implementation, the processing module is specifically configured to: set the first modification timestamps of the directories in all scanning results within the time period from T3 - ΔT to T3 to negative numbers, where T3 is the time at which the storage device unload message is received and ΔT is a preset time period.
In a third aspect, the present application provides a computer storage medium for storing computer software instructions used by the processing module in the file scanning apparatus provided in the second aspect, the instructions including a program designed to execute the above aspect.
In a fourth aspect, the present application provides a computer program that includes instructions which, when executed by a computer, cause the computer to execute the flow executed by the processing module in the file scanning apparatus in the second aspect.
In a fifth aspect, an embodiment of the present invention provides an electronic device, where the electronic device includes a processor, and the processor is configured to support the electronic device to implement a corresponding function in the file scanning method provided in the first aspect. The electronic device may also include a memory, coupled to the processor, that stores program instructions and data necessary for the electronic device. The electronic device may also include a communication interface for the electronic device to communicate with other devices or a communication network.
In a sixth aspect, the present application provides a chip system, which includes a processor, configured to enable an electronic device to implement the functions referred to in the first aspect, for example, to generate or process information referred to in the file scanning method. In one possible design, the system-on-chip further includes a memory for storing program instructions and data necessary for the electronic device. The chip system may be constituted by a chip, or may include a chip and other discrete devices.
Drawings
Fig. 1 is a schematic diagram of a system architecture for scanning a file according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of a file scanning system architecture of an electronic device according to an embodiment of the present invention.
Fig. 3 is a flowchart illustrating a file scanning method according to an embodiment of the present invention.
Fig. 4 is a schematic diagram of first access file information according to an embodiment of the present invention.
Fig. 5 is a schematic diagram of information of a re-access file according to an embodiment of the present invention.
Fig. 6A is a schematic flowchart of generating a dirty directory set by file scanning according to an embodiment of the present invention.
Fig. 6B is a schematic view of a file scanning process according to an embodiment of the present invention.
Fig. 7A is a schematic diagram of first file information and second file information of a mobile hard disk inserted into a smart screen for the first time according to an embodiment of the present invention.
Fig. 7B is a schematic diagram of direct subfile information when a storage device first accesses a root directory according to an embodiment of the present invention.
Fig. 8A is a schematic diagram of first subfile information and second subfile information of the direct subfiles of the A directory according to an embodiment of the present invention.
Fig. 8B is a schematic diagram of first subfile information and second subfile information of a direct subfile of a C-directory according to an embodiment of the present invention.
Fig. 8C is a schematic diagram illustrating first subfile information and second subfile information of the direct subfile of the X directory according to the embodiment of the present invention.
Fig. 9 is a schematic diagram of scanning a file accessed by a storage device for the first time according to an embodiment of the present invention.
Fig. 10 is a schematic diagram of first file information and second file information when the mobile hard disk is reinserted into the smart screen according to an embodiment of the present invention.
Fig. 11A is a schematic diagram of first subfile information and second subfile information of the direct subfiles of the A directory after reinsertion according to an embodiment of the present invention.
Fig. 11B is a schematic diagram of first subfile information and second subfile information of the direct subfiles of the C directory after reinsertion according to an embodiment of the present invention.
Fig. 11C is a schematic diagram illustrating first subfile information and second subfile information of an E-directory direct subfile according to an embodiment of the present invention.
Fig. 12 is a schematic diagram of scanning a storage device to access a file again according to an embodiment of the present invention.
Fig. 13 is a schematic structural diagram of a file scanning apparatus according to an embodiment of the present invention.
Detailed Description
The embodiments of the present invention will be described below with reference to the drawings.
The terms "first," "second," "third," and "fourth," etc. in the description and claims of this application and in the accompanying drawings are used for distinguishing between different objects and not for describing a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
Hereinafter, some terms in the present application are explained to facilitate understanding by those skilled in the art.
(1) A Universally Unique Identifier (UUID) is a 128-bit value generated so that, in practice, no two UUIDs have the same value.
(2) File System (FS): a file system is the set of methods and data structures that an operating system uses to identify files on a disk or partition, that is, the way files are organized on the disk. The term also refers to a disk or partition used to store files, or to a file system type.
(3) The Virtual File System (VFS) reads and writes different file systems on different physical media, and provides a unified operation interface and application programming interface for the various file systems. The VFS is a glue layer that allows system calls such as open(), read(), and write() to work regardless of the underlying storage medium and file system type.
First, the technical problems to be solved by the present application are analyzed and presented. In the prior art, file scanning technology includes the following scheme one and scheme two:
Scheme one: each time the storage device is connected to the electronic device, a full-disk traversal and full-disk file parsing are performed, specifically including the following steps 1-5:
Step 1: The storage device is connected to the electronic device, and the operating system of the electronic device starts a media scanning process to scan the files in the to-be-scanned file list stored in the storage device.
Step 2: The electronic device traverses the to-be-scanned file list stored in the storage device to obtain all information of all files in the list.
Step 3: During the traversal, when the electronic device reaches a media file in the to-be-scanned file list, it parses the media file to obtain a parsing result. For example, a picture-type media file is parsed to generate its thumbnail.
Step 4: The electronic device temporarily stores the scan results for the to-be-scanned file list in a media database of the electronic device.
Step 5: The storage device is disconnected from the electronic device, and the electronic device clears from the media database the file scan results obtained by scanning the files of that storage device.
This scheme applies to many scenarios and the scanning method has low complexity, but it has the following disadvantages:
Disadvantage 1: During file scanning, the electronic device performs file traversal and media file parsing together, and media file parsing is time-consuming. When the storage device has a large capacity and holds many media files, scanning takes a long time and is inefficient.
Disadvantage 2: On some electronic devices, such as a smart screen, external storage devices are plugged in and unplugged frequently. Existing file scanning clears the scan results of an external storage device when it is disconnected, so when the storage device is connected again a full traversal and full parsing are performed again; scanning takes a long time and is inefficient.
Scheme two: a separate file information database is established for each storage device, and the contents of that database are not cleared after the storage device is disconnected. The scheme specifically includes the following steps 1-5:
Step 1: The storage device is connected to the electronic device, and the electronic device traverses all files in the to-be-scanned file list stored in the storage device to acquire all file information in the list.
Step 2: The electronic device performs matching according to the storage device information, finds the file information database that matches the storage device, and determines whether a corresponding file parsing result exists in that database.
Step 3: If the file information database contains file parsing results for the storage device, the data in the database is compared with the full-disk file information obtained through traversal, and the parsing step is skipped for any file that has not been modified.
Step 4: If a file has been modified, the file is parsed again.
Step 5: The electronic device writes the file scan results for the to-be-scanned file list stored in the storage device into the file information database corresponding to the storage device and saves them.
The second scheme can avoid repeatedly analyzing the unmodified file when the storage device is accessed again, so that the file scanning efficiency is improved, but the following defects also exist:
Disadvantage: When the storage device has a large storage capacity and stores a large number of files, the electronic device still performs a full-disk traversal of the storage device, and acquiring all file information of the to-be-scanned file list is time-consuming.
In summary, when the storage device has a large storage capacity and a large number of files, the existing file scanning methods take a long time and are inefficient, resulting in poor user experience. The file scanning method provided in the present application is intended to solve this technical problem.
To facilitate understanding of the embodiments of the present invention, application scenarios of the file scanning method in the present application are described below by way of example. Two scenarios may be included:
Scenario one: a mobile hard disk is frequently plugged into a smart screen. In this scenario, the mobile hard disk corresponds to the storage device in an embodiment of the present invention, and the smart screen corresponds to the electronic device in an embodiment of the present invention. With the development of technology, the smart screen has been derived from the traditional television. The smart screen plays multiple roles in a household: it is not only a home audio and video entertainment center, but also an information sharing center, a control management center, and a multi-device interaction center. The smart screen displays file information stored on a connected mobile hard disk; for example, a movie stored on the mobile hard disk is watched on the smart screen. After the mobile hard disk is connected, the smart screen scans the stored files and can then play the movies stored on the mobile hard disk. If the same mobile hard disk is plugged into and unplugged from the same smart screen multiple times, the file scanning method in the present application can shorten the time the user waits for file scanning, improve file scanning efficiency, and thereby improve user experience.
Scenario two: files in a cloud disk are synchronized and backed up to a personal computer. In this scenario, the cloud disk corresponds to the storage device in an embodiment of the present invention, and the personal computer corresponds to the electronic device in an embodiment of the present invention. As the amount of information grows rapidly, users have many videos, pictures, texts, and other data to view on a personal computer, and their demand for viewing personal data in real time is increasing. Provided that the data will not be lost and is kept sufficiently confidential, users are willing to upload personal data to a cloud disk for storage. Synchronizing and backing up files from the cloud disk to the personal computer would otherwise require a full-disk scan of the cloud disk. With the file scanning method in the present application, the files that need to be synchronized and backed up can be found more quickly, so that the full-disk scan is avoided and the efficiency of synchronizing and backing up files from the cloud disk to the personal computer is improved.
It is understood that the above two application scenarios are only a few exemplary implementations in the embodiments of the present invention, and the application scenarios in the embodiments of the present invention include, but are not limited to, the above application scenarios.
Embodiments of the present application are described below with reference to the drawings.
Based on the technical problems and the corresponding application scenarios in the present application, and to facilitate understanding of the embodiments of the present invention, a system architecture on which the embodiments of the present invention are based is described below. Referring to fig. 1, fig. 1 is a schematic diagram of a system architecture for file scanning according to an embodiment of the present invention, where the system is used to solve the problem of low file scanning efficiency when a storage device is frequently plugged into and unplugged from the same electronic device. The system architecture may include a storage device and an electronic device, where:
The storage device 101 is, in the present application, a device for storing information; it generally digitizes information and stores it in a medium by electrical, magnetic, optical, or other means. Common storage devices include memory, hard disks, USB flash drives, cloud disks, and the like. For example, a mobile hard disk uses USB or IEEE 1394 interfaces, can be plugged in or unplugged at any time, is small and portable, and can exchange data with a system at a relatively high speed. When the mobile hard disk is connected to an interface of the electronic device, it can exchange data with the system of the electronic device.
The electronic device 102 is, in the present application, an electronic device having an operating system and a data transmission interface. Common electronic devices include smart screens, personal computers, tablet computers, and the like. For example, the smart screen is derived from the traditional television screen and has data transmission interfaces such as a USB interface and an HDMI interface, to which storage devices such as a mobile hard disk, a USB flash drive, or a USB card reader can be connected. After a storage device is connected, the smart screen can establish data transmission with the storage device to acquire the content stored in it.
It is to be understood that the file scanning system architecture in fig. 1 is only an exemplary implementation in the embodiments of the present application, and the file scanning system architecture in the embodiments of the present application includes, but is not limited to, the above system architecture.
The following describes a file scanning system architecture in the electronic device according to an embodiment of the present invention. Referring to fig. 2, fig. 2 is a schematic diagram of a file scanning system architecture of the electronic device according to an embodiment of the present invention, where the file scanning system architecture includes a request processing module 201, a traversal module 202, a parsing module 203, and a data module 204.
The request processing module 201 is responsible for processing scan requests from the outside. For example, the request processing module includes a broadcast receiver and a scan manager. After receiving a mount broadcast from the Android kernel or a scan request broadcast initiated by a third-party application, the broadcast receiver sends an instruction to the scan manager, which triggers the scan manager to send a file scan request to a request processor in the traversal module.
The traversal module 202 is responsible for traversing modified or newly added directories and their direct subfiles on the corresponding storage device. For example, the traversal module includes a request processor, a scan task queue, and a task pool. After the module receives the message sent by the request processing module, the request processor looks up the first file information corresponding to the storage device according to the Universally Unique Identifier (UUID) of the storage device, finds the first modification timestamp of each directory from the first directory information in the first file information, and compares it with the second modification timestamp of the current directory to obtain a dirty directory set. The dirty directories are placed into the task pool in the order of the scan task queue and traversed; when a new directory or a new dirty directory is found, it is added to the scan task queue.
The parsing module 203 is responsible for parsing files. For example, when it is determined that a file needs to be parsed, the parsing module creates a file object, obtains the file encoding, reads the information stream of the file and caches it in memory, and then reads the file information. After processing, the parsing is complete and a parsing result is obtained, for example, a thumbnail of a media file.
The data module 204 is responsible for saving the scanned data results. For example, after the traversal and parsing of the file are completed, the database records the scanning result of the file scanning, including the file parsing result and the updated directory entry information. The information recorded by the database includes, but is not limited to, path, parent path, modification timestamp, and file type.
For example, consider scenario one, in which the mobile hard disk is frequently connected to the smart screen. In this scenario, the mobile hard disk corresponds to the storage device in the embodiment of the present invention, and the smart screen corresponds to the electronic device in the embodiment of the present invention. When the mobile hard disk is inserted into the smart screen interface again, the request processing module 201, after receiving a scan request broadcast initiated by the Android kernel or a third-party application, sends an instruction to the scan manager, and scanning of the files stored on the mobile hard disk is initiated.
The data module 204 retrieves the directory information recorded by the last scan of the mobile hard disk according to the Universally Unique Identifier (UUID) of the storage device and compares it with the directory information obtained on this connection, so as to form a dirty directory set.
The traversal module 202 traverses the dirty directory set to obtain the file information of all subdirectories under each dirty directory and of the direct subfiles of those subdirectories. The modified files are then found, and the parsing module 203 re-parses them.
Finally, the data module 204 updates and stores the scan results. If the mobile hard disk is connected to the smart screen for the first time, the root directory of the mobile hard disk is added to the dirty directory set for traversal.
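The queue-driven traversal performed by the traversal module can be sketched as follows. The sketch is illustrative only: the in-memory `tree` mapping stands in for the storage device, `known_dirs` stands in for the directories already present in the stored scan results, and the placement of an E directory under A in the usage example is hypothetical. For brevity the sketch queues only newly found directories; detecting previously known subdirectories whose timestamps have changed is omitted.

```python
from collections import deque


def traverse_dirty(tree, dirty_dirs, known_dirs):
    """Visit each dirty directory once, queuing directories that are newly found."""
    queue = deque(sorted(dirty_dirs))  # plays the role of the scan task queue
    queued = set(queue)
    visited = []
    while queue:
        directory = queue.popleft()
        visited.append(directory)
        # Only the direct children of a dirty directory are inspected.
        for child_path, is_dir in tree.get(directory, []):
            if is_dir and child_path not in known_dirs and child_path not in queued:
                queued.add(child_path)
                queue.append(child_path)
    return visited
```

On a first connection nothing is known, so starting from the root visits every directory; on a later connection only the dirty directories and any directories newly found beneath them are visited:

```python
tree = {
    "/": [("/A", True), ("/X", True)],
    "/A": [("/A/E", True), ("/A/a.jpg", False)],
    "/A/E": [("/A/E/e.mp4", False)],
    "/X": [],
}
first_visit = traverse_dirty(tree, {"/"}, set())
later_visit = traverse_dirty(tree, {"/A"}, {"/", "/A", "/X"})
```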
The following describes a specific method architecture on which the embodiments of the present invention are based. Referring to fig. 3, fig. 3 is a schematic flowchart of a file scanning method in an embodiment of the present application, and the file scanning method in the embodiment of the present application will be described below with reference to fig. 3 from an interaction side of a storage device and an electronic device based on the file scanning system architecture in fig. 1. It should be noted that, in order to describe the file scanning method in the embodiment of the present application in more detail, in each flow step of the present application, it is described that the corresponding execution main body is an electronic device, but this does not mean that the embodiment of the present application can only perform the corresponding method flow through the described execution main body.
Step S301: Receive a scan request for the target storage device, and acquire first file information.
Specifically, the first file information includes first directory information from the file scan results that were stored in a database after the storage device was last connected to the electronic device and the electronic device scanned the files of the storage device. The first directory information is information about all directories in those scan results and includes a first modification timestamp for each directory, which is the modification timestamp of the directory recorded after the storage device was last scanned. For example, fig. 4 and fig. 5 are schematic diagrams of the file information at first access and at re-access. When the storage device is connected for the first time, the file information stored in the electronic device is empty, and thus the first file information is empty. When the storage device is connected again, the file information stored in the electronic device is as shown in fig. 5, so the first directory information in the first file information includes the root directory "/", the A directory, the C directory, and the X directory; the first modification timestamp of the A directory is 8:00, that of the C directory is 8:00, and that of the X directory is 5:00. It is to be understood that the first file information may be obtained by scanning after the storage device is first connected to the electronic device, or by scanning after the storage device has been connected to the electronic device multiple times.
Step S302: Acquire second file information.
Specifically, after the last scan of the storage device is completed and the storage device is disconnected, the user may modify, delete, or add files stored in the storage device. When the storage device is connected to the electronic device again, the electronic device obtains second file information. The second file information includes second directory information obtained on this connection from the to-be-scanned file list stored in the storage device. The second directory information is information about all directories in the to-be-scanned file list and includes a second modification timestamp for each directory, which is the modification time recorded when the directory was most recently modified. For example, as shown in fig. 4 and fig. 5, when the storage device is connected for the first time, the file information stored in the storage device is as shown in fig. 4, so the second directory information in the second file information includes the root directory "/", the A directory, the C directory, and the X directory; the second modification timestamp of the A directory is 8:00, that of the C directory is 8:00, and that of the X directory is 5:00. When the storage device is connected again, the file information stored in the storage device is as shown in fig. 5, so the second directory information in the second file information includes the root directory "/", the A directory, the C directory, the X directory, and the E directory; the second modification timestamp of the A directory is 10:00, that of the C directory is 9:00, that of the X directory is 5:00, and that of the E directory is 10:00.
Step S303: Compare the first directory information with the second directory information to generate a dirty directory set.
Specifically, for each directory that has the same name in the first directory information and the second directory information, the electronic device compares its first modification timestamp with its second modification timestamp; if the two timestamps differ, the directory is marked as a dirty directory and added to the dirty directory set. For example, in fig. 4 and fig. 5, when the storage device is connected for the first time, only the root directory "/" is added to the dirty directory set. When the storage device is connected again, the first modification timestamps of the A directory, the C directory, and the X directory are compared with their second modification timestamps; the timestamps of the A directory and the C directory differ, so the A directory and the C directory are added to the dirty directory set.
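The comparison in step S303 can be illustrated with the timestamps from fig. 4 and fig. 5. The following is a minimal sketch under stated assumptions: directories are keyed by the names used in the figures (a real implementation would more likely key them by full path), timestamps are kept as the strings shown in the example, and the function name is hypothetical.

```python
def build_dirty_set(first_dirs, second_dirs, root="/"):
    """Return the initial dirty directory set from two {directory: timestamp} maps."""
    if not first_dirs:
        # First connection: nothing has been scanned yet, so start from the root.
        return {root}
    return {
        name
        for name, first_mtime in first_dirs.items()
        if name in second_dirs and second_dirs[name] != first_mtime
    }


first = {"A": "8:00", "C": "8:00", "X": "5:00"}
second = {"A": "10:00", "C": "9:00", "X": "5:00", "E": "10:00"}
dirty = build_dirty_set(first, second)  # contains "A" and "C"
```

The E directory has no stored counterpart, so it is not part of this initial set; in the flow described here a new directory is instead found while the direct subfiles of a dirty directory are traversed.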
It should be further noted that dirty directories are taken from the dirty directory set and only the direct subfiles under each dirty directory are traversed, which can greatly reduce the number of files traversed in the to-be-scanned file list stored in the storage device. The direct subfiles include the subdirectories and subfiles directly under the dirty directory. For example, in the file information stored in the electronic device in fig. 5, the direct subfiles of the root directory "/" are the A directory, the X directory, and the C directory.
With the method of the embodiment of the present invention, when the storage device is connected to the electronic device again, the electronic device can find modified or newly added files more quickly, parse them, and update the scan results of the scanned file list in the database of the electronic device. This file scanning approach reduces the time the user waits, after reinserting the storage device, for the electronic device to scan the to-be-scanned file list stored in the storage device, which improves file scanning efficiency and user experience.
When the electronic device scans the files in the to-be-scanned file list stored in the storage device, the operating system of the electronic device may be the Android system or another operating system. When the electronic device receives a file scan request, as long as a dirty directory set is formed first and only the direct subfiles under the dirty directories are traversed, the number of files to be traversed can be greatly reduced, the problems in the prior art can be solved, and the corresponding effects can be achieved.
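Listing only the direct subfiles of a directory, without descending into subdirectories, can be sketched with `os.scandir` from the Python standard library. The function name and the returned tuple layout are assumptions made for this illustration.

```python
import os


def list_direct_children(directory):
    """Return (subdirectory names, file names) found directly under `directory`."""
    subdirs, files = [], []
    with os.scandir(directory) as entries:
        for entry in entries:
            # Symbolic links are not followed, so a link to a directory counts as a file here.
            if entry.is_dir(follow_symlinks=False):
                subdirs.append(entry.name)
            else:
                files.append(entry.name)
    return sorted(subdirs), sorted(files)
```

Because the call is not recursive, the cost of processing one dirty directory depends on the number of its direct subfiles rather than on the size of the whole subtree.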
The following describes a specific method architecture on which the embodiments of the present invention are based. Referring to fig. 6A and fig. 6B, fig. 6A is a detailed flowchart of generating a dirty directory set by file scanning in an embodiment of the present application, and fig. 6B is a flowchart of file scanning provided in an embodiment of the present application, and a file scanning method in an embodiment of the present application will be described below with reference to fig. 6A and fig. 6B from an interaction side of a storage device and an electronic device based on the file scanning system architecture in fig. 1. It should be noted that, in order to describe the file scanning method in the embodiment of the present application in more detail, in each flow step of the present application, it is described that the corresponding execution main body is an electronic device, but this does not mean that the embodiment of the present application can only perform the corresponding method flow through the described execution main body.
Step S601: a scan request is received for a target storage device.
Specifically, in the embodiment of the present invention, before scanning files on a storage device, the electronic device needs to receive a scan request for the target storage device. Before the electronic device receives the scan request, the storage device connected to the electronic device is mounted. After the storage device is connected, the electronic device parses the file system structure stored on the storage device and identifies its file system type among those supported by the operating system of the electronic device. The operating system calls the corresponding driver according to the file system type of the storage device, processes the metadata, and attaches the information to the directory tree for presentation, so that the storage device is incorporated into the file system of the operating system of the electronic device and the mount operation is complete. After the mount operation is complete, the operating system (for example, the Android system) sends a mount broadcast, and after receiving the mount broadcast the electronic device scans the files of the target storage device.
Step S602: first file information is acquired.
In a possible implementation, acquiring the first file information includes searching, according to the Universally Unique Identifier (UUID) of the storage device, for the first file information that matches the UUID. Specifically, since the electronic device sets a UUID for each storage device and the UUID corresponding to each storage device is different, when the storage device is connected to the electronic device again, the electronic device looks up the latest scan results of that storage device in the database according to the UUID. It should be noted that the first file information includes directory information that was stored in the database of the electronic device when the storage device was last connected to the electronic device. First directory information of the scanned file list stored in the storage device is obtained from the first file information. The first directory information includes a first modification timestamp of each directory in the scanned file list, which is the last modification timestamp of that directory as stored in the database of the electronic device after file scanning was completed.
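The UUID-keyed lookup described above can be sketched with the standard-library `sqlite3` module. The table name, column names, and the `'dir'` file type value are hypothetical; the recorded fields (path, parent path, modification timestamp, file type) follow the description of the data module, and an in-memory database is used only to keep the example self-contained.

```python
import sqlite3


def open_scan_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE scan_result (
               device_uuid TEXT NOT NULL,
               path        TEXT NOT NULL,
               parent_path TEXT,
               mtime       INTEGER NOT NULL,
               file_type   TEXT NOT NULL,
               PRIMARY KEY (device_uuid, path)
           )"""
    )
    return conn


def save_record(conn, device_uuid, path, parent_path, mtime, file_type):
    # Insert a new record or overwrite the existing one for this device and path.
    conn.execute(
        "INSERT OR REPLACE INTO scan_result VALUES (?, ?, ?, ?, ?)",
        (device_uuid, path, parent_path, mtime, file_type),
    )


def load_directory_mtimes(conn, device_uuid):
    """Return {path: first modification timestamp} for one device's directories."""
    rows = conn.execute(
        "SELECT path, mtime FROM scan_result "
        "WHERE device_uuid = ? AND file_type = 'dir'",
        (device_uuid,),
    )
    return dict(rows.fetchall())
```

A device whose UUID has no stored records yields an empty mapping, which corresponds to the first-connection case in which the first file information is empty.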
Step S603: second file information is acquired.
Specifically, after the storage device is mounted, it is incorporated into the file system of the operating system of the electronic device, and the electronic device can obtain the second file information. The second file information includes second directory information of a to-be-scanned file list stored in the target storage device, the second directory information includes a second modification timestamp of each directory in the to-be-scanned file list, and the second modification timestamp is the modification timestamp recorded when each directory in the to-be-scanned file list stored in the storage device was most recently changed.
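One way the second directory information could be collected from a mounted storage device is sketched below. The embodiment does not specify which interface is used; this sketch assumes Python's standard `os.walk` and `os.stat`, and the function name and the "/"-prefixed relative keys are illustrative choices.

```python
import os
from typing import Dict


def collect_second_directory_info(mount_point: str) -> Dict[str, float]:
    """Map every directory under mount_point to its current modification time.

    Keys are paths relative to the mount point ("/" is the root directory).
    """
    info: Dict[str, float] = {}
    for dirpath, _dirnames, _filenames in os.walk(mount_point):
        rel = os.path.relpath(dirpath, mount_point)
        key = "/" if rel == "." else "/" + rel.replace(os.sep, "/")
        # st_mtime plays the role of the "second modification timestamp".
        info[key] = os.stat(dirpath).st_mtime
    return info
```

Only directory timestamps are read here; no file is opened or parsed.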
Optionally, after the first file information and the second file information are acquired, step S6031 is executed, in which the first modification timestamp in the acquired first file information is denoted by t and the second modification timestamp is denoted by t'.
Step S604: the first directory information is compared with the second directory information to generate a dirty directory set.
Specifically, after the electronic device obtains the result of the last file scan of the storage device, it obtains the last modification timestamps of all directories from that result and compares the last modification timestamp of each directory with its current latest modification timestamp. If the last modification timestamp of a directory with the same file name is inconsistent with its current latest modification timestamp, step S6042 is executed to add the directory to the dirty directory set, thereby generating an initial dirty directory set. If the last modification timestamp of a directory with the same file name is consistent with its current latest modification timestamp, step S6041 is executed: the directory and its direct subfiles are considered unmodified, with no new file added.
In one possible implementation, generating the dirty directory set includes, in a case where no directory exists in the first directory information, adding the root directory of the list of files to be scanned stored by the storage device to the dirty directory set. Specifically, when the storage device is connected to the electronic device for the first time, the database of the electronic device has no record of a scan result for the storage device; in this case, only the root directory of the files stored in the storage device is initially recorded as a dirty directory. The root directory is then traversed, and if a new directory or a new dirty directory is found during the traversal, it is added to the dirty directory set.
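The generation of the initial dirty directory set in step S604 can be sketched as follows. This is a simplified model under stated assumptions: directory information is represented as a dictionary from path to timestamp, the function name `build_dirty_set` is hypothetical, and a directory that appears only in the second directory information is treated as dirty, following the definition given for the processing module.

```python
from typing import Dict, Set


def build_dirty_set(first: Dict[str, float],
                    second: Dict[str, float],
                    root: str = "/") -> Set[str]:
    """Compare last-scan timestamps (first) with current ones (second)."""
    if not first:
        # No earlier scan record: only the root directory starts out dirty.
        return {root}
    dirty: Set[str] = set()
    for path, current in second.items():
        previous = first.get(path)
        # Dirty if the directory is new or its timestamp has changed.
        if previous is None or previous != current:
            dirty.add(path)
    return dirty


# Illustrative timestamps (hours), not taken from any figure.
first_info = {"/A": 8, "/X": 5, "/C": 8}
second_info = {"/A": 10, "/X": 5, "/C": 9}
dirty_now = build_dirty_set(first_info, second_info)
```

With the illustrative data, the unchanged directory `/X` is left out of the set, so it is never traversed.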
In a possible implementation manner, after the first directory information is compared with the second directory information, a directory to be deleted is deleted from the first directory information, and the scanning result corresponding to the directory to be deleted is deleted; the directory to be deleted is a directory which appears in the first directory information and does not appear in the second directory information. Specifically, if the user deletes some directories stored in the storage device after it is pulled out and then connects it to the electronic device again, the deleted directories have no current latest modification timestamp, but their last modification timestamps still exist in the database of the electronic device. When the electronic device finds that a directory has only a last modification timestamp and no current latest modification timestamp, it determines that the directory has been deleted and deletes all information related to that directory from its database.
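The handling of deleted directories can be illustrated with the sketch below. The dictionaries standing in for the first directory information and the stored scan results, and the function name `purge_deleted_directories`, are assumptions made for illustration.

```python
from typing import Dict, List


def purge_deleted_directories(first: Dict[str, float],
                              second: Dict[str, float],
                              scan_results: Dict[str, dict]) -> List[str]:
    """Remove directories that exist only in the last scan record.

    Returns the removed paths in sorted order.
    """
    deleted = sorted(path for path in first if path not in second)
    for path in deleted:
        del first[path]               # drop the stale timestamp record
        scan_results.pop(path, None)  # drop the stored scan result, if any
    return deleted


# Illustrative data: /X was removed while the device was unplugged.
first_info = {"/A": 8, "/X": 5}
second_info = {"/A": 8}
results = {"/A": {"files": 2}, "/X": {"files": 1}}
removed = purge_deleted_directories(first_info, second_info, results)
```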
Step S605: it is determined whether the dirty directory set is an empty set.
Specifically, before the electronic device starts file scanning of the dirty directory set, it first determines whether the dirty directory set is an empty set. In a possible implementation manner, when the dirty directory set is an empty set, the file scanning of the list of files to be scanned is ended. Specifically, if the user does not change any file or directory stored in the storage device after it is pulled out, the modification timestamps of those files and directories do not change. When the storage device is connected to the electronic device again and the electronic device compares the last modification timestamp of each directory with the current latest modification timestamp of the directory with the same file name, if the two timestamps are consistent for all directories, it is determined that no directory has changed, step S6051 is executed, and the file scanning of the storage device is ended. In one possible implementation, when the dirty directory set is a non-empty set, file scanning is performed on the dirty directories in the dirty directory set. Specifically, if the user changes a file or directory stored in the storage device after it is pulled out, the modification timestamp of the changed file or directory changes. The last modification timestamps of some directories are then inconsistent with their current latest modification timestamps, so the electronic device performs step S6052 to take a dirty directory out of the dirty directory set and then performs file scanning on it. It should be noted that multiple dirty directories may be taken out of the dirty directory set and processed simultaneously.
In one possible implementation, performing file scanning on the dirty directories in the dirty directory set includes comparing the first modification timestamp and the second modification timestamp of each dirty directory in the dirty directory set, and deleting from the dirty directory set any dirty directory whose first modification timestamp is consistent with its second modification timestamp. Specifically, because the electronic device can scan multiple dirty directories in parallel, step S6053 is performed before each dirty directory is scanned to determine again whether the last modification timestamp of the dirty directory matches its current latest modification timestamp; if so, the dirty directory is deleted from the dirty directory set, which avoids scanning the same dirty directory repeatedly.
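The re-check performed before each dirty directory is scanned can be sketched as follows. The sketch is sequential for simplicity, whereas the embodiment allows several dirty directories to be processed in parallel; the function name `take_directories_to_scan` and the dictionary representation are assumptions.

```python
from typing import Dict, List, Set


def take_directories_to_scan(dirty: Set[str],
                             first: Dict[str, float],
                             second: Dict[str, float]) -> List[str]:
    """Empty the dirty set, keeping only directories that are still dirty."""
    to_scan: List[str] = []
    while dirty:
        path = dirty.pop()
        if first.get(path) == second.get(path):
            # Timestamps already agree (for example, the directory was
            # handled by another worker), so it is dropped without a scan.
            continue
        to_scan.append(path)
    return sorted(to_scan)


# Illustrative data: /A was already brought up to date, /C was not.
dirty_set = {"/A", "/C"}
pending = take_directories_to_scan(dirty_set,
                                   {"/A": 10, "/C": 8},
                                   {"/A": 10, "/C": 9})
```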
Step S606: the attribute information of the dirty directory is recorded.
Specifically, the electronic equipment records attribute information of the dirty directory; the attribute information includes a second modified timestamp for the dirty directory. Before the changed directory is scanned, the latest current modification timestamp of the changed directory is recorded, so that the file scanning result in the database can be updated later.
Step S607: all direct subfiles under the dirty directory are traversed.
Specifically, the electronic device traverses all direct subfiles under the dirty directory; the direct subfiles include the subfiles and subdirectories directly under the dirty directory. When file scanning is performed on a dirty directory, only its direct subfiles are traversed, without descending into the files and directories below those direct subfiles. The reason is that when the initial dirty directory set is determined, all directories (including subdirectories) whose timestamps have changed are determined to be dirty directories, and the embodiment of the present invention scans the direct subfiles and subdirectories of every dirty directory. Therefore, scanning only the direct subfiles of each dirty directory is sufficient to cover all directories and files whose timestamps have changed, as well as all newly added directories, in the storage device.
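Traversing only the direct subfiles of a dirty directory, without recursion, can be sketched with Python's standard `os.scandir`. The function name `list_direct_subfiles` and the returned tuple layout are illustrative assumptions; the embodiment does not name a specific interface.

```python
import os
from typing import Dict, Tuple


def list_direct_subfiles(directory: str) -> Dict[str, Tuple[bool, float]]:
    """Return {name: (is_directory, modification_time)} for direct children.

    Nothing below the first level is visited.
    """
    entries: Dict[str, Tuple[bool, float]] = {}
    with os.scandir(directory) as it:
        for entry in it:
            is_dir = entry.is_dir(follow_symlinks=False)
            mtime = entry.stat(follow_symlinks=False).st_mtime
            entries[entry.name] = (is_dir, mtime)
    return entries
```

A file nested inside a subdirectory does not appear in the result; it is reached only if that subdirectory is itself scanned as a dirty directory.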
In one possible implementation manner, first subfile information is obtained; the first subfile information comprises a first sub-modification timestamp of the direct subfile of the dirty directory in the scanned file list; acquiring second subfile information; the second subfile information comprises a second modified timestamp of the direct subfile of the dirty directory in the list of files to be scanned; comparing the first subfile information and the second subfile information of the direct subfiles with the same file name to determine the changed direct subfile. Specifically, when the electronic device traverses the direct subfiles in the dirty directory, the electronic device first executes step S6071 to obtain the first subfile information and the second subfile information. It should be noted that the first subfile information includes a last modification timestamp of the direct subfile of the dirty directory in the scanned file list stored in the target storage device and other information of the direct subfile scanning result. The second subfile information includes a current latest modification timestamp of the direct subfile of the dirty directory in the list of files to be scanned stored by the target storage device and other information of the direct subfile.
Next, the electronic device performs step S6072, and the electronic device compares the last modification timestamp of the direct subfile of the dirty directory having the same file name with the current latest modification timestamp. After the storage device is pulled out, if a user changes the direct subfile of the dirty directory stored in the storage device, the modification timestamp of the direct subfile changes. And comparing the last modification time stamp of the direct subfile with the current latest modification time stamp, and if the two time stamps are not consistent, judging that the direct subfile is modified. When the last modification timestamp of the direct subfile is consistent with the current latest modification timestamp, step S6073 is executed, and the electronic device continues to traverse other direct subfiles under the dirty directory. And when the last modification time stamp of the direct subfile is not consistent with the current latest modification time stamp, determining that the direct subfile is changed.
In one possible implementation manner, the determining the direct subfile which is changed includes determining the direct subfile to be a modified direct subfile; the modified direct subfile is a direct subfile with the same file name in the first subfile information and the second subfile information and inconsistent first sub-modification timestamp and the second sub-modification timestamp; determining that the direct subfile is a deleted direct subfile; the deleted immediate subfile is an immediate subfile that does not appear in the second subfile information and that appears in the first subfile information. Specifically, if the direct subfile with the same file name has the last modification timestamp and the latest current modification timestamp, and the two modification timestamps are not consistent, it indicates that after the storage device is disconnected, the user has modified the direct subfile, and determines the direct subfile as the modified direct subfile. When the direct subfile with the same file name only has the last modification time stamp but does not have the latest current modification time stamp, the direct subfile is deleted by the user after the storage device is disconnected, and the direct subfile is determined as the deleted direct subfile.
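The comparison of first and second subfile information can be sketched as below. The function name `classify_direct_subfiles` is hypothetical, and newly added direct subfiles are reported as a separate group purely for readability; in the embodiment they are simply treated as changed. The sample timestamps are illustrative.

```python
from typing import Dict, List, Tuple


def classify_direct_subfiles(
    first_sub: Dict[str, str],
    second_sub: Dict[str, str],
) -> Tuple[List[str], List[str], List[str]]:
    """Return (modified, added, deleted) names of direct subfiles."""
    modified = sorted(name for name in second_sub
                      if name in first_sub
                      and first_sub[name] != second_sub[name])
    added = sorted(name for name in second_sub if name not in first_sub)
    deleted = sorted(name for name in first_sub if name not in second_sub)
    return modified, added, deleted


# Illustrative data: B unchanged, C modified, E new, Z removed.
first_sub = {"B": "7:00", "C": "8:00", "Z": "6:00"}
second_sub = {"B": "7:00", "C": "9:00", "E": "10:00"}
modified, added, deleted = classify_direct_subfiles(first_sub, second_sub)
```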
In one possible implementation, the information of the deleted direct subfile is deleted from the scan result of the scanned file list. Specifically, when the electronic device finds that a direct subfile has only a last modification timestamp and no current latest modification timestamp, it determines that the direct subfile has been deleted and deletes all information related to that direct subfile from its own database.
After the electronic device finds a changed direct subfile, it performs step S6074. In one possible implementation, it is determined whether the modified direct subfile is a directory; if not, the modified direct subfile is determined to be a modified subfile, the file attribute of the modified subfile is acquired, file parsing is performed on the modified subfile, and the scan result of the modified subfile is updated in the scan result of the scanned file list; if yes, the modified direct subfile is determined to be a modified subdirectory, the directory attribute of the modified subdirectory is acquired, the first sub-modification timestamp of the modified subdirectory is set to a negative number, and the scan result of the modified subdirectory is inserted into the scan result of the scanned file list. Specifically, when a changed direct subfile under the dirty directory is found to be a file, the electronic device performs step S6075 to re-parse the modified subfile and then performs step S6076 to update the file scan result of the modified subfile in the database of the electronic device. When a modified subdirectory is found, the electronic device performs step S6077 to set the modification timestamp in the record of the modified subdirectory to a negative number; because a modification timestamp is normally greater than 0, a negative value marks the modified subdirectory as dirty. The electronic device then performs step S6078 to insert the modified subdirectory into the scan result and performs step S6079 to determine whether the insertion succeeded. If the insertion succeeded, the modified subdirectory is a new directory, and step S60710 is performed to add it to the dirty directory set.
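The branch taken for a changed direct subfile can be sketched as follows. Several points are assumptions: the scan results are modeled as a dictionary in which an insert fails when the key already exists, file parsing is reduced to a placeholder flag, and the names `handle_changed_subfile` and `DIRTY_MARK` are hypothetical.

```python
from typing import Dict, Set

DIRTY_MARK = -1  # negative timestamp; real timestamps are greater than 0


def handle_changed_subfile(path: str, is_dir: bool, mtime: float,
                           scan_results: Dict[str, dict],
                           dirty: Set[str]) -> str:
    """Process one changed direct subfile and report what was done."""
    if not is_dir:
        # File: re-parse (placeholder) and update its stored scan result.
        scan_results[path] = {"mtime": mtime, "parsed": True}
        return "file-updated"
    if path in scan_results:
        # Insert fails: the directory is already known from an earlier scan.
        return "directory-exists"
    # New directory: store it with a negative timestamp and mark it dirty.
    scan_results[path] = {"mtime": DIRTY_MARK}
    dirty.add(path)
    return "directory-added"
```

For example, a directory that is already recorded yields `"directory-exists"` and is not queued again, while a directory seen for the first time is stored with a negative timestamp and queued for its own scan.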
Step S608: the scanning result of the current dirty directory is updated.
Specifically, after traversing all the direct subfiles under the dirty directory, the electronic device changes the value of the last modification timestamp of the dirty directory to the value of the current latest modification timestamp, and finally updates the scanning result of the dirty directory in the database of the electronic device.
In one possible implementation, the first modification timestamps of all scan results within the time period from (T3 - ΔT) to T3 are set to negative numbers, where T3 is the time when the storage device unload message is received and ΔT is a preset time period. Optionally, ΔT is a time difference obtained for different systems, that is, for the different operating systems running on the electronic device. Note that ΔT is generally an empirical value greater than or equal to (T3 - T1), where T1 is the time when the storage device is disconnected from the electronic device. Completing steps S601-S608 above enables faster file scanning. However, in actual use it was found that when an external storage device is hot-plugged, the results returned by the file system interface may be abnormal, which may make the scan results inconsistent; therefore, this embodiment adds a mechanism to ensure the stability of file scanning. In the embodiment of the present invention, the first modification timestamps of the directories in all scan results within the time period from (T3 - ΔT) to T3 are set to negative numbers. When the electronic device scans the storage device again, a negative timestamp can never be consistent with a normal timestamp, so these directories are determined to be dirty directories when the dirty directory set is formed and are therefore traversed and parsed again. In this way, the problem of inconsistent results returned by the file system interface when the storage device is hot-plugged or abnormally interrupted can be solved, which improves the stability of file scanning and its usability in actual scenarios.
It should be noted that the time when the external storage device is disconnected from the electronic device is denoted as T1, the time when the file system on the external storage device is unloaded from the Virtual File System (VFS) is denoted as T2, and the time when the electronic device receives the unload message is denoted as T3. Therefore, as long as the scan results written into the data blocks during the period from T1 to T3 are handled properly, the problem of inconsistent scan results can be avoided. Specifically, an empirical time difference ΔT is obtained for each system, where ΔT is generally greater than or equal to (T3 - T1), and the first modification timestamps of all directories scanned within the time period from (T3 - ΔT) to T3 are changed to negative numbers, which marks these directories to be traversed and parsed again at the next scan.
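The invalidation of recently written scan results can be sketched as follows. The embodiment does not state how the time at which a result was written is stored; this sketch assumes a hypothetical `scanned_at` field on each record, and the function name and sample values are illustrative.

```python
from typing import Dict, List


def invalidate_recent_results(records: Dict[str, dict],
                              t3: float, delta_t: float) -> List[str]:
    """Set a negative timestamp on records written in [t3 - delta_t, t3].

    Returns the affected directory paths in sorted order.
    """
    window_start = t3 - delta_t
    changed: List[str] = []
    for path, record in records.items():
        if window_start <= record["scanned_at"] <= t3:
            record["mtime"] = -1  # can never equal a real timestamp
            changed.append(path)
    return sorted(changed)


# Illustrative data: unload message received at t3 = 100, delta_t = 10.
records = {
    "/A": {"mtime": 800, "scanned_at": 95},  # inside the window
    "/X": {"mtime": 500, "scanned_at": 40},  # well before the window
}
invalidated = invalidate_recent_results(records, t3=100, delta_t=10)
```

On the next scan, `/A` no longer matches its current timestamp and is therefore treated as a dirty directory, while `/X` keeps its trusted record.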
In order to describe the file scanning method in the embodiment of the present application in more detail, the present embodiment will be described in detail with reference to the first application scenario, the schematic diagram of accessing file information for the first time in fig. 4, and the schematic diagram of accessing file information for the second time in fig. 5.
In this scenario, the mobile hard disk corresponds to the storage device in the embodiment of the invention, and the smart screen corresponds to the electronic device in the embodiment of the invention. When the mobile hard disk is connected to the smart screen for the first time, it is first mounted on the smart screen so that the operating system of the smart screen can access the files and directories on the mobile hard disk. After the mobile hard disk is mounted, the first file information corresponding to the mobile hard disk is looked up in the database according to the Universally Unique Identifier (UUID) of the mobile hard disk, and the operating system then obtains the second file information. Fig. 7A is a schematic diagram of the first file information and the second file information when the mobile hard disk is inserted into the smart screen for the first time. In the figure, the first file information contains no first directory information, and the second directory information of the second file information contains an A directory, an X directory and a C directory under the root directory. The root directory in the second file information, typically "/", is added to the dirty directory set. At this point the dirty directory set contains only one dirty directory, the root directory "/".
Then, the first subfile information and the second subfile information under the root directory are obtained. Fig. 7B is a schematic diagram of the direct subfile information of the root directory when the storage device is connected for the first time. In the figure, the first subfile information contains no direct subfile information, and the second subfile information contains A with a second modification timestamp of 8:00, X with a second modification timestamp of 5:00, and C with a second modification timestamp of 8:00. A, X and C are all determined to be directories, so the first sub-modification timestamps of the three directories are each set to a negative number, and the scan results of the three directories are then inserted into the scan result of the scanned file list. Because the mobile hard disk is inserted into the smart screen for the first time, the scan result of the scanned file list is empty, so the scan results of the three directories can all be inserted. Next, the three directories are added to the dirty directory set, and the second modification timestamp of the A directory is recorded as 8:00, that of the X directory as 5:00, and that of the C directory as 8:00.
Then, the first subfile information and the second subfile information of the direct subfiles of the A directory, the X directory and the C directory are obtained. Fig. 8A is a schematic diagram of the first subfile information and the second subfile information of the direct subfiles of the A directory. In the figure, the first subfile information contains no direct subfile information, and the second subfile information contains B, whose second sub-modification timestamp is 7:00, and C, whose second sub-modification timestamp is 8:00. B is determined to be a file, so the smart screen acquires the file attribute of B and parses the file, then updates the scan result of this subfile in the scan result of the scanned file list, and at the same time updates the first sub-modification timestamp of the B file in the first subfile information to 7:00.
C is determined to be a directory, so the smart screen acquires the directory attribute of the C directory, sets the first sub-modification timestamp of the C directory to a negative number, and attempts to insert the scan result of the C directory into the scan result of the scanned file list; the insertion fails, so the C directory is not added to the dirty directory set again. After the direct subfiles of the A directory have been traversed, the first modification timestamp of the A directory is changed to 8:00.
Fig. 8B is a schematic diagram of the first subfile information and the second subfile information of the direct subfiles of the C directory. The first subfile information contains no direct subfile information, and the second subfile information contains D, whose second sub-modification timestamp is 8:00. D is determined to be a file, so the smart screen acquires the file attribute of D and parses the file, then updates the scan result of this subfile in the scan result of the scanned file list, and at the same time updates the first sub-modification timestamp of the D file in the first subfile information to 8:00. After the direct subfiles of the C directory have been traversed, the first modification timestamp of the C directory is changed to 8:00.
Fig. 8C is a schematic diagram of the first subfile information and the second subfile information of the direct subfiles of the X directory. The first subfile information contains no direct subfile information, and the second subfile information contains Y, whose second sub-modification timestamp is 5:00. Y is determined to be a file, so the smart screen acquires the file attribute of Y and parses the file, then updates the scan result of this subfile in the scan result of the scanned file list, and at the same time updates the first sub-modification timestamp of the Y file in the first subfile information to 5:00. After the direct subfiles of the X directory have been traversed, the first modification timestamp of the X directory is changed to 5:00.
As shown in fig. 9, when the mobile hard disk is inserted into the smart screen for the first time, and the smart screen finishes scanning the files of the mobile hard disk, the file scanning result of the scanned file list is stored in the database of the smart screen.
After the mobile hard disk is connected to the smart screen again, it is first mounted on the smart screen so that the operating system of the smart screen can access the files and directories on the mobile hard disk. After the mobile hard disk is mounted, the first file information corresponding to the mobile hard disk is looked up in the database according to the Universally Unique Identifier (UUID) of the mobile hard disk, and the operating system obtains the first file information and the second file information. Fig. 10 is a schematic diagram of the first file information and the second file information when the mobile hard disk is inserted into the smart screen again. At this point, the first directory information in the first file information includes an A directory, an X directory and a C directory, and the second directory information in the second file information includes an A directory, a C directory, an X directory and an E directory. The first modification timestamp and the second modification timestamp of directories with the same file name are compared, and the directories whose timestamps are inconsistent, here the A directory and the C directory, are put into the dirty directory set. The dirty directory set now contains two dirty directories. Dirty directories can be traversed simultaneously, in no particular order; the following takes scanning the A directory and the C directory simultaneously as an example.
The A directory and the C directory are taken out of the dirty directory set, and their first modification timestamps are compared with their second modification timestamps to confirm that A and C are still dirty directories at this moment. The second modification timestamp of the A directory is then recorded as 10:00 and that of the C directory as 9:00.
Fig. 11A is a schematic diagram of the first subfile information and the second subfile information of the direct subfiles of directories A and C. The first subfile information contains B, whose first sub-modification timestamp is 7:00, and C, whose first sub-modification timestamp is 8:00; the second subfile information contains B, whose second sub-modification timestamp is 7:00, C, whose second sub-modification timestamp is 9:00, and E, whose second sub-modification timestamp is 10:00. Comparing the first sub-modification timestamp and the second sub-modification timestamp of direct subfiles with the same file name shows that the two timestamps are inconsistent for C and E.
C and E are both determined to be directories, so the first sub-modification timestamps of the two directories are set to negative numbers and the C directory and the E directory are inserted into the scan result of the scanned file list. Inserting the C directory fails because the C directory already took part in the file scanning process when the mobile hard disk was first connected. The E directory, compared with the last scan result, is a newly added directory, so it is inserted successfully and is added to the dirty directory set. The first modification timestamp of the A directory is then changed to the recorded second modification timestamp, 10:00.
Fig. 11B is a schematic diagram of the first subfile information and the second subfile information of the direct subfiles of the C directory when the mobile hard disk is inserted again. The first subfile information contains D with a first sub-modification timestamp of 8:00, and the second subfile information contains D with a second sub-modification timestamp of 8:30. Comparing the two shows that the first sub-modification timestamp and the second sub-modification timestamp of D are inconsistent. D is determined to be a file, so the smart screen acquires the file attribute of D and parses the file, then updates the scan result of this subfile in the scan result of the scanned file list, updates the first sub-modification timestamp of the D file in the first subfile information to 8:30, and changes the first modification timestamp of the C directory to the recorded second modification timestamp, 9:00.
The dirty directory E is taken out of the dirty directory set, the first modification timestamp and the second modification timestamp of the E directory are compared to confirm that the E directory is still a dirty directory at this moment, and the second modification timestamp of E is recorded as 10:00.
Fig. 11C is a schematic diagram of the first subfile information and the second subfile information of the direct subfiles of the E directory. The first subfile information contains no direct subfile information, and the second subfile information contains F, whose second sub-modification timestamp is 9:30. Comparing the first sub-modification timestamp and the second sub-modification timestamp of F shows that they are inconsistent. F is determined to be a file, so the smart screen acquires the file attribute of F and parses the file, then updates the scan result of this subfile in the scan result of the scanned file list, updates the first sub-modification timestamp of the F file in the first subfile information to 9:30, and changes the first modification timestamp of the E directory to the recorded second modification timestamp, 10:00.
Fig. 12 is a schematic diagram of file scanning when the storage device is connected again. As shown in the figure, when the mobile hard disk is inserted into the smart screen again and the smart screen completes file scanning of the mobile hard disk, the file scan result of the scanned file list is updated in the database of the smart screen.
In the actual test, to exclude the influence of file parsing, file parsing was separated from file scanning and deferred to the stage where parsing is actually needed. File scanning in the test therefore spends no time on file parsing, so the results reflect the optimization of the file traversal process. The following example illustrates one specific test procedure and its results.
Assume that a mobile hard disk (i.e., the storage device) with a capacity of 4 TB is used in the test, storing about 1500 directories and about 30000 files, with about 2 TB of data in total. When the mobile hard disk is connected to the electronic device for the first time, the electronic device takes about 110 seconds to traverse all the files, whereas a second traversal with no file modified takes only about 6 seconds. When the modification ratio on the storage device reaches about 10%, the traversal takes about 21 seconds, and when the modification ratio reaches about 50%, the traversal takes about 57 seconds. It should be noted that all test data are averages over multiple measurements, and the modification ratio is a combined figure based on the directory modification ratio and the file modification ratio.
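The reported times can be expressed as ratios relative to the full first traversal. The short calculation below uses only the approximate figures quoted above; the variable names are illustrative, and the rounded ratios inherit the imprecision of the "about" values.

```python
# Approximate traversal times in seconds, as quoted in the test description.
FULL_SCAN_SECONDS = 110
rescan_seconds = {"0%": 6, "10%": 21, "50%": 57}

# Ratio of the full traversal time to each re-scan time, by modification ratio.
speedups = {ratio: round(FULL_SCAN_SECONDS / seconds, 1)
            for ratio, seconds in rescan_seconds.items()}

for ratio, factor in speedups.items():
    print(f"modification ratio {ratio}: about {factor}x faster")
```

This works out to roughly 18.3 times faster with no modification, 5.2 times faster at about 10% modification, and 1.9 times faster at about 50% modification.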
From the test results, by using the file scanning method of the embodiment of the invention, when the storage device is scanned again, the time for traversing the files is shortened by reducing the number of the traversed files, and the overall efficiency of file scanning is improved.
The method of the embodiments of the present invention is explained in detail above, and the related apparatus of the embodiments of the present invention is provided below.
Referring to fig. 13, fig. 13 is a schematic structural diagram of a file scanning apparatus according to an embodiment of the present invention. The file scanning apparatus 130 may include a receiving module 1301, a first obtaining module 1302, a second obtaining module 1303, a processing module 1304, a deleting module 1305, and a recording module 1306. Each module is described in detail below.
A receiving module 1301, configured to receive a scan request for a target storage device;
a first obtaining module 1302, configured to obtain first file information; the first file information comprises first directory information of a scanned file list stored by the target storage device, the first directory information comprising a first modification timestamp for each directory of the scanned file list;
a second obtaining module 1303, configured to obtain second file information, where the second file information includes second directory information of a to-be-scanned file list stored in the target storage device, and the second directory information includes a second modification timestamp of each directory in the to-be-scanned file list;
a processing module 1304, configured to compare the first directory information with the second directory information and generate a dirty directory set; a dirty directory in the dirty directory set is either a directory that has the same file name in the first directory information and the second directory information but whose first modification timestamp and second modification timestamp are inconsistent, or a directory that appears in the second directory information but not in the first directory information.
In a possible implementation, the first obtaining module 1302 is specifically configured to: search, according to the universally unique identifier (UUID) of the storage device, for the first file information matching the UUID.
In a possible implementation, the processing module 1304 is specifically configured to: add the root directory of the to-be-scanned file list stored by the storage device to the dirty directory set when no directory exists in the first directory information.
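The comparison performed by the processing module 1304, including the case where no directory has been recorded yet, can be sketched as follows. This is a minimal illustration and not the patented implementation: representing each side as a dictionary from directory path to modification timestamp, and the names `build_dirty_set`, `first_dirs`, `second_dirs`, and `root`, are assumptions made for the example.

```python
from typing import Dict, Set


def build_dirty_set(
    first_dirs: Dict[str, int],
    second_dirs: Dict[str, int],
    root: str,
) -> Set[str]:
    """Return the directories that need to be rescanned.

    first_dirs:  directory path -> timestamp recorded by the previous scan
                 (the "first directory information"; assumed representation).
    second_dirs: directory path -> timestamp currently on the device
                 (the "second directory information"; assumed representation).
    """
    # Nothing recorded yet: start the scan from the root directory.
    if not first_dirs:
        return {root}

    dirty: Set[str] = set()
    for path, second_mtime in second_dirs.items():
        if path not in first_dirs:
            # Appears only in the second directory information.
            dirty.add(path)
        elif first_dirs[path] != second_mtime:
            # Same name, inconsistent modification timestamps.
            dirty.add(path)
    return dirty
```

A directory whose recorded timestamp has been set to a negative number never matches a real timestamp, so under this sketch it is always treated as dirty.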
In one possible implementation, the apparatus further includes: a deleting module 1305, configured to delete the directory to be deleted from the first directory information and delete the scanning result corresponding to the directory to be deleted after the first directory information is compared with the second directory information; the directory to be deleted is a directory which appears in the first directory information and does not appear in the second directory information.
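The clean-up done by the deleting module 1305 can be sketched in the same style. The dictionary layout and the names `prune_deleted_directories` and `scan_results` are hypothetical; in particular, keying the scanning results by directory path is an assumption of this example.

```python
from typing import Any, Dict, List


def prune_deleted_directories(
    first_dirs: Dict[str, int],
    second_dirs: Dict[str, int],
    scan_results: Dict[str, Any],
) -> List[str]:
    """Drop directories that exist only in the previous scan record.

    Both first_dirs and scan_results are modified in place; the removed
    paths are returned in sorted order.
    """
    # Directories recorded earlier that no longer exist on the device.
    to_delete = sorted(path for path in first_dirs if path not in second_dirs)
    for path in to_delete:
        del first_dirs[path]
        # Assumed layout: one scanning-result entry per directory path.
        scan_results.pop(path, None)
    return to_delete
```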
In a possible implementation, the processing module 1304 is specifically configured to: end the file scanning of the to-be-scanned file list when the dirty directory set is an empty set.
In a possible implementation, the processing module 1304 is specifically configured to: perform file scanning on the dirty directories in the dirty directory set when the dirty directory set is a non-empty set.
In a possible implementation, the processing module 1304 and the deleting module 1305 are specifically configured as follows: the processing module 1304 is configured to compare the first modification timestamp and the second modification timestamp of a dirty directory in the dirty directory set; the deleting module 1305 is configured to delete, from the dirty directory set, any dirty directory whose first modification timestamp is consistent with its second modification timestamp.
In one possible implementation, the apparatus further includes: a recording module 1306, configured to record attribute information of the dirty directory; the attribute information includes the second modification timestamp of the dirty directory.
In a possible implementation, the processing module 1304 is specifically configured to: traverse all direct subfiles under the dirty directory; the direct subfiles include the subfiles and subdirectories under the dirty directory.
In a possible implementation, the first obtaining module 1302, the second obtaining module 1303 and the processing module 1304 are specifically configured as follows: the first obtaining module 1302 is configured to obtain first subfile information, where the first subfile information includes a first sub-modification timestamp of the direct subfiles of the dirty directory in the scanned file list; the second obtaining module 1303 is configured to obtain second subfile information, where the second subfile information includes a second sub-modification timestamp of the direct subfiles of the dirty directory in the to-be-scanned file list; the processing module 1304 is configured to compare the first subfile information with the second subfile information for direct subfiles with the same file name, and determine the direct subfiles that have changed.
In a possible implementation, the processing module 1304 is specifically configured to: determine that a direct subfile is a modified direct subfile, where a modified direct subfile is a direct subfile that has the same file name in the first subfile information and the second subfile information but whose first sub-modification timestamp and second sub-modification timestamp are inconsistent; and determine that a direct subfile is a deleted direct subfile, where a deleted direct subfile is a direct subfile that appears in the first subfile information but not in the second subfile information.
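The two categories defined above can be computed with a pair of dictionary comparisons. The sketch below assumes each side is a dictionary from direct subfile name to sub-modification timestamp; the function and parameter names are illustrative. This passage does not say how an entry that appears only in the second subfile information is classified, so the sketch deliberately leaves such entries out.

```python
from typing import Dict, List, Tuple


def classify_direct_subfiles(
    first_sub: Dict[str, int],
    second_sub: Dict[str, int],
) -> Tuple[List[str], List[str]]:
    """Return (modified, deleted) direct subfile names, each sorted."""
    # Same name on both sides, inconsistent sub-modification timestamps.
    modified = sorted(
        name
        for name, second_ts in second_sub.items()
        if name in first_sub and first_sub[name] != second_ts
    )
    # Recorded by the previous scan but no longer present.
    deleted = sorted(name for name in first_sub if name not in second_sub)
    return modified, deleted
```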
In a possible implementation, the deleting module 1305 is specifically configured to: delete the information of the deleted direct subfile from the scanning result of the scanned file list.
In a possible implementation, the processing module 1304 is specifically configured to: determine whether the modified direct subfile is a directory; if not, determine that the modified direct subfile is a modified subfile, obtain the file attributes of the modified subfile, perform file analysis on the modified subfile, and update the scanning result of the modified subfile in the scanning result of the scanned file list; if so, determine that the modified direct subfile is a modified subdirectory, obtain the directory attributes of the modified subdirectory, set the first sub-modification timestamp of the modified subdirectory to a negative number, and insert the scanning result of the modified subdirectory into the scanning result of the scanned file list.
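The branch on "is it a directory" might look like the sketch below. It is an illustration under several assumptions: the record layout, the sentinel `UNSCANNED_MTIME`, and the `analyze_file` callback are all hypothetical, and adding the subdirectory to the dirty set after the insert follows claim 13 rather than this paragraph. The only library calls used are `os.stat` and `stat.S_ISDIR` from the Python standard library.

```python
import os
import stat
from typing import Any, Callable, Dict, Set

# Hypothetical sentinel: any negative value marks "not yet scanned".
UNSCANNED_MTIME = -1


def handle_modified_subfile(
    path: str,
    scan_results: Dict[str, Dict[str, Any]],
    dirty_dirs: Set[str],
    analyze_file: Callable[[str], Any] = lambda p: None,
) -> None:
    """Update scan_results for one modified direct subfile."""
    st = os.stat(path)
    if stat.S_ISDIR(st.st_mode):
        # Modified subdirectory: store a negative timestamp so it can
        # never match the real one, then queue it for scanning.
        scan_results[path] = {"is_dir": True, "mtime": UNSCANNED_MTIME}
        dirty_dirs.add(path)
    else:
        # Modified regular file: refresh attributes and analysis result.
        scan_results[path] = {
            "is_dir": False,
            "mtime": st.st_mtime_ns,
            "size": st.st_size,
            "analysis": analyze_file(path),
        }
```

Storing a negative timestamp for the subdirectory means a later comparison against its real timestamp is always inconsistent, so the subdirectory stays dirty until it has actually been traversed.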
In a possible implementation, the processing module 1304 is specifically configured to: determine that all the direct subfiles of the dirty directory have been traversed, and update the scanning result of the current dirty directory; the scanning result of the current dirty directory includes the first modification timestamp of the dirty directory.
In a possible implementation, the processing module 1304 is specifically configured to: set the first modification timestamps of all scanning results in the time period T3-ΔT to negative numbers, where T3 is the time at which the storage device offload (unmount) message is received, and ΔT is a preset time period. It should be noted that, for the functions of each functional module in the file scanning apparatus 130 described in the embodiment of the present invention, reference may be made to the related description of step S601 to step S608 in the method embodiment described in fig. 6A and fig. 6B; details are not repeated here.
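The timestamp invalidation on unmount can be sketched as below, under explicit assumptions. The text does not define the window precisely; this example reads "the time period T3-ΔT" as the interval from T3 minus ΔT up to T3, and assumes each scanning result carries a hypothetical `updated_at` field recording when it was last written. Restricting the change to directory records follows the wording of claim 15.

```python
from typing import Any, Dict, List


def invalidate_recent_directories(
    scan_results: Dict[str, Dict[str, Any]],
    t3: float,
    delta_t: float,
    invalid_mtime: int = -1,
) -> List[str]:
    """Set a negative timestamp on directory records updated in [t3 - delta_t, t3].

    Returns the affected paths in sorted order; scan_results is modified
    in place. The window interpretation is an assumption of this sketch.
    """
    window_start = t3 - delta_t
    invalidated: List[str] = []
    for path, record in scan_results.items():
        if record["is_dir"] and window_start <= record["updated_at"] <= t3:
            record["mtime"] = invalid_mtime
            invalidated.append(path)
    return sorted(invalidated)
```

Because a negative value never equals a real modification timestamp, the affected directories are compared as inconsistent on the next scan and are therefore rescanned.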
An embodiment of the present invention further provides a computer storage medium, where the computer storage medium may store a program, and when the program is executed, some or all of the steps of any one of the file scanning methods described in the above method embodiments are performed.
Embodiments of the present invention also provide a computer program, which includes instructions that, when executed by a computer, enable the computer to perform some or all of the steps of any of the file scanning methods.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present application is not limited by the order of acts described, as some steps may occur in other orders or concurrently depending on the application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required in this application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in actual implementation; for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical or in another form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as a separate product, it may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like, and may specifically be a processor in the computer device) to execute all or part of the steps of the methods of the embodiments of the present application. The storage medium may include a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a read-only memory (ROM), a random access memory (RAM), and the like.
The above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (30)

1. A file scanning method, comprising:
receiving a scanning request aiming at a target storage device, and acquiring first file information; the first file information comprises first directory information of a scanned file list stored by the target storage device, the first directory information comprising a first modification timestamp for each directory of the scanned file list;
acquiring second file information, wherein the second file information comprises second directory information of a to-be-scanned file list stored in the target storage device, and the second directory information comprises a second modification timestamp of each directory in the to-be-scanned file list;
comparing the first directory information with the second directory information to generate a dirty directory set; a dirty directory in the dirty directory set is either a directory that has the same file name in the first directory information and the second directory information but whose first modification timestamp and second modification timestamp are inconsistent, or a directory that appears in the second directory information but not in the first directory information.
2. The method of claim 1, wherein the obtaining the first file information comprises:
searching, according to the universally unique identifier (UUID) of the storage device, for the first file information matching the UUID.
3. The method of claim 1, wherein the generating a dirty directory set comprises:
adding the root directory of the to-be-scanned file list stored by the storage device to the dirty directory set when no directory exists in the first directory information.
4. The method of claim 1, wherein after comparing the first directory information to the second directory information, further comprising:
deleting the directory to be deleted from the first directory information, and deleting the scanning result corresponding to the directory to be deleted; the directory to be deleted is a directory which appears in the first directory information and does not appear in the second directory information.
5. The method of claim 1, wherein the method further comprises:
when the dirty directory set is an empty set, ending the file scanning of the file list to be scanned;
when the dirty directory set is a non-empty set, performing file scanning on the dirty directories in the dirty directory set.
6. The method of claim 5, wherein the performing file scanning on the dirty directories in the dirty directory set comprises:
comparing the first modification timestamp and the second modification timestamp of a dirty directory in the dirty directory set;
deleting, from the dirty directory set, the dirty directory whose first modification timestamp is consistent with its second modification timestamp.
7. The method of claim 1, wherein the method further comprises:
recording attribute information of the dirty directory; the attribute information includes the second modification timestamp of the dirty directory.
8. The method of claim 1, wherein the method further comprises:
traversing all direct subfiles under the dirty directory; the direct subfiles include subfiles and subdirectories under the dirty directory.
9. The method of claim 8, wherein the method further comprises:
acquiring first subfile information; the first subfile information comprises a first sub-modification timestamp of the direct subfile of the dirty directory in the scanned file list;
acquiring second subfile information; the second subfile information comprises a second sub-modification timestamp of the direct subfile of the dirty directory in the to-be-scanned file list;
comparing the first subfile information and the second subfile information of the direct subfiles with the same file name to determine the changed direct subfile.
10. The method of claim 9, wherein the determining the changed direct subfile comprises:
determining that the direct subfile is a modified direct subfile; the modified direct subfile is a direct subfile that has the same file name in the first subfile information and the second subfile information but whose first sub-modification timestamp and second sub-modification timestamp are inconsistent;
determining that the direct subfile is a deleted direct subfile; the deleted direct subfile is a direct subfile that appears in the first subfile information but not in the second subfile information.
11. The method of claim 10, wherein the method further comprises:
deleting the information of the deleted direct subfile in the scanning result of the scanned file list.
12. The method of claim 10, wherein the method further comprises:
judging whether the modified direct subfile is a directory or not;
if not, determining that the modified direct subfile is a modified subfile, acquiring the file attribute of the modified subfile, performing file analysis on the modified subfile, and updating the scanning result of the modified subfile in the scanning result of the scanned file list;
if yes, determining that the modified direct sub-file is a modified sub-directory, acquiring the directory attribute of the modified sub-directory, setting the first sub-modification time stamp of the modified sub-directory as a negative number, and inserting the scanning result of the modified sub-directory into the scanning result of the scanned file list.
13. The method of claim 12, the inserting the scan result of the modified subdirectory into the scan result of the list of scanned files, further comprising:
judging whether the scanning result of the modified subdirectory is successfully inserted into the scanning result of the scanned file list or not;
in the event that the insertion is successful, adding the modified subdirectory to the set of dirty directories.
14. The method of any one of claims 1-13, further comprising:
determining that all of the direct subfiles of the dirty directory have been traversed;
updating the scanning result of the current dirty directory; the scanning result of the current dirty directory includes the first modification timestamp of the dirty directory.
15. The method of claim 1, wherein the method further comprises:
setting the first modification timestamps of the directories in all scanning results in the time period T3-ΔT to negative numbers, where T3 is the time at which the storage device offload (unmount) message is received, and ΔT is a preset time period.
16. A file scanning apparatus, comprising:
a receiving module, configured to receive a scan request for a target storage device;
a first obtaining module, configured to obtain first file information; the first file information comprises first directory information of a scanned file list stored by the target storage device, the first directory information comprising a first modification timestamp for each directory of the scanned file list;
a second obtaining module, configured to obtain second file information, where the second file information includes second directory information of a to-be-scanned file list stored in the target storage device, and the second directory information includes a second modification timestamp of each directory in the to-be-scanned file list;
a processing module, configured to compare the first directory information with the second directory information to generate a dirty directory set; a dirty directory in the dirty directory set is either a directory that has the same file name in the first directory information and the second directory information but whose first modification timestamp and second modification timestamp are inconsistent, or a directory that appears in the second directory information but not in the first directory information.
17. The apparatus of claim 16, wherein the first obtaining module is specifically configured to:
search, according to the universally unique identifier (UUID) of the storage device, for the first file information matching the UUID.
18. The apparatus of claim 16, wherein the apparatus further comprises:
a deleting module, configured to delete the directory to be deleted from the first directory information and delete the scanning result corresponding to the directory to be deleted after the first directory information is compared with the second directory information; the directory to be deleted is a directory which appears in the first directory information and does not appear in the second directory information.
19. The apparatus of claim 16, wherein the processing module is specifically configured to:
when the dirty directory set is an empty set, ending the file scanning of the file list to be scanned;
when the dirty directory set is a non-empty set, performing file scanning on the dirty directories in the dirty directory set.
20. The apparatus of claim 19, wherein the processing module and the deleting module are specifically configured as follows:
the processing module is configured to compare the first modification timestamp and the second modification timestamp of a dirty directory in the dirty directory set;
the deleting module is configured to delete, from the dirty directory set, the dirty directory whose first modification timestamp is consistent with its second modification timestamp.
21. The apparatus of claim 16, wherein the apparatus further comprises:
a recording module, configured to record attribute information of the dirty directory; the attribute information includes the second modification timestamp of the dirty directory.
22. The apparatus of claim 16, wherein the processing module is specifically configured to:
traversing all direct subfiles under the dirty directory; the direct subfiles include subfiles and subdirectories under the dirty directory.
23. The apparatus of claim 22, wherein the first obtaining module, the second obtaining module, and the processing module are specifically configured as follows:
the first obtaining module is configured to obtain first subfile information; the first subfile information comprises a first sub-modification timestamp of the direct subfile of the dirty directory in the scanned file list;
the second obtaining module is configured to obtain second subfile information; the second subfile information comprises a second sub-modification timestamp of the direct subfile of the dirty directory in the to-be-scanned file list;
the processing module is configured to compare the first subfile information and the second subfile information of the direct subfiles with the same file name to determine the changed direct subfile.
24. The apparatus of claim 23, wherein the processing module is specifically configured to:
determining that the direct subfile is a modified direct subfile; the modified direct subfile is a direct subfile that has the same file name in the first subfile information and the second subfile information but whose first sub-modification timestamp and second sub-modification timestamp are inconsistent;
determining that the direct subfile is a deleted direct subfile; the deleted direct subfile is a direct subfile that appears in the first subfile information but not in the second subfile information.
25. The apparatus of claim 24, wherein the processing module is specifically configured to:
judging whether the modified direct subfile is a directory or not;
if not, determining that the modified direct subfile is a modified subfile, acquiring the file attribute of the modified subfile, performing file analysis on the modified subfile, and updating the scanning result of the modified subfile in the scanning result of the scanned file list;
if yes, determining that the modified direct sub-file is a modified sub-directory, acquiring the directory attribute of the modified sub-directory, setting the first sub-modification time stamp of the modified sub-directory as a negative number, and inserting the scanning result of the modified sub-directory into the scanning result of the scanned file list.
26. The apparatus of claim 25, wherein the processing module is specifically configured to:
determining that all of the direct subfiles of the dirty directory have been traversed;
updating the scanning result of the current dirty directory; the scanning result of the current dirty directory includes the first modification timestamp of the dirty directory.
27. The apparatus of claim 26, wherein the processing module is specifically configured to:
judging whether the scanning result of the modified subdirectory is successfully inserted into the scanning result of the scanned file list or not;
in the event that the insertion is successful, adding the modified subdirectory to the set of dirty directories.
28. The apparatus according to any one of claims 16 to 27, wherein the processing module is specifically configured to:
determining that all of the direct subfiles of the dirty directory have been traversed;
updating the scanning result of the current dirty directory; the scanning result of the current dirty directory includes the first modification timestamp of the dirty directory.
29. A computer storage medium, characterized in that it stores a computer program which, when executed by a processor, implements the method of any of the preceding claims 1-14.
30. A computer program, characterized in that the computer program comprises instructions which, when executed by a computer, cause the computer to carry out the method according to any one of claims 1-14.
CN202010890951.6A 2020-08-29 2020-08-29 File scanning method and related device Pending CN114116611A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010890951.6A CN114116611A (en) 2020-08-29 2020-08-29 File scanning method and related device

Publications (1)

Publication Number Publication Date
CN114116611A true CN114116611A (en) 2022-03-01

Family

ID=80359860

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010890951.6A Pending CN114116611A (en) 2020-08-29 2020-08-29 File scanning method and related device

Country Status (1)

Country Link
CN (1) CN114116611A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114860656A (en) * 2022-04-12 2022-08-05 深圳市绿联科技股份有限公司 File scanning method and device, electronic equipment and storage medium
WO2024051654A1 (en) * 2022-09-05 2024-03-14 华为技术有限公司 File processing method and electronic device
CN115599929A (en) * 2022-09-30 2023-01-13 荣耀终端有限公司(Cn) File management method and electronic equipment
CN115599929B (en) * 2022-09-30 2023-08-04 荣耀终端有限公司 File management method and electronic equipment

Similar Documents

Publication Publication Date Title
WO2016070534A1 (en) Method and system for scanning local media file
CN100588236C (en) Data reproducing device and content management method
CN114116611A (en) File scanning method and related device
US10552384B2 (en) Synchronizing media files available from multiple sources
US8370402B2 (en) Dual representation of stored digital content
EP3495981B1 (en) Directory deletion method and device, and storage server
US20070192797A1 (en) Method of and apparatus for managing distributed contents
US8639661B2 (en) Supporting media content revert functionality across multiple devices
JP5870468B2 (en) Method and apparatus for managing images of mobile terminals
CN104737135B (en) The information processing terminal and synchronisation control means
US7809742B2 (en) Content management method, apparatus, and system
CN100530190C (en) Apparatus and method for processing information
US20120102076A1 (en) Information processing apparatus, information processing method, and program
US11250888B1 (en) Flash memory and method for storing and retrieving embedded audio video data
US20140136496A1 (en) System, method and non-transitory computer readable storage medium for supporting network file accessing and versioning with multiple protocols in a cloud storage server
WO2022156484A1 (en) Media data processing method and apparatus, and terminal device
WO2017096850A1 (en) File system synchronization method and device
KR20060128207A (en) Apparatus and method for automatically uploading contents file
CN113448946B (en) Data migration method and device and electronic equipment
KR100838806B1 (en) Multi-media information device network system
CN103246729A (en) Method and system for processing multi-media files of android mobile terminal
US20090150332A1 (en) Virtual file managing system and method for building system configuration and accessing file thereof
CN111399753B (en) Method and device for writing pictures
CN103648021A (en) Method for playing network video files from USB storage device
KR20050114410A (en) Method and apparatus for managing directory in the storage media and storing media thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination