WO2022228664A1 - Management server and method for file storage management - Google Patents

Management server and method for file storage management

Info

Publication number
WO2022228664A1
Authority
WO
WIPO (PCT)
Prior art keywords
file
change
storage
list
given
Prior art date
Application number
PCT/EP2021/061091
Other languages
French (fr)
Inventor
Shahar SALZMAN
Assaf Natanzon
Asaf Yeger
Michael Gutman
Shmoolik Yosub
David Segal
Original Assignee
Huawei Technologies Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd. filed Critical Huawei Technologies Co., Ltd.
Priority to PCT/EP2021/061091 priority Critical patent/WO2022228664A1/en
Priority to CN202180096730.4A priority patent/CN117099101A/en
Publication of WO2022228664A1 publication Critical patent/WO2022228664A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55Detecting local intrusion or implementing counter-measures
    • G06F21/56Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/562Static detection
    • G06F21/565Static detection by checking file integrity

Definitions

  • the present disclosure relates generally to the field of storage management and more specifically, to a management server and a method for managing storage of files in storage servers.
  • Computer viruses and malwares are generally attached to a file (or a computer program), by unauthorised users. Such viruses and malwares can replicate and spread after an initial execution of the corresponding file on a computer system. Such viruses and malwares are harmful and can destroy critical files and computer data, which can thereby slow down the computer system.
  • avoiding the computer viruses and the malware requires substantial computational resources which are allocated for file system scans and for monitoring of changes to specific files. However, many times such resources are not used efficiently, for example a duplicate file is scanned several times on several systems, or on multiple copies on the same computer system. Further, such scans impose a load on network attached storage (NAS) servers which are coupled to various computer systems, where access to each file needs to be synchronized since the network attached storage server supports concurrent access. Therefore, synchronization is usually performed by locking parts of the file system. However, locking parts of the file system reduces an overall performance of the network attached storage server, and thus, there exists a technical problem of scanning several times the duplicate files on the same and different computer systems.
  • the present disclosure provides a management server and a method for managing storage of files in storage servers.
  • the present disclosure provides a solution to the existing problem of scanning of duplicate files, several times, on same and different computer systems.
  • An objective of the present disclosure is to provide a solution that overcomes at least partially the problems encountered in the prior art and provides an improved system and a method for file storage management to avoid the existing problem of scanning of duplicate files, several times, on same and different computer systems.
  • the present disclosure provides a management server for managing the storage of files in one or more storage server the management server comprising a catalog configured to store a first list of files stored in one or more monitored servers amongst the storage servers, the first list recording, for each listed file, an identifier for identifying the listed file, one or more storage locations for identifying the one or more monitored servers and the one or more locations on the identified monitored servers, where the listed file is stored, and a content information related to the content of the listed file, a change detection module configured to detect a change having occurred on a given file at one or more of the storage locations recorded in the first list for the given file, by determining the content information related to the content of the changed given file and comparing it with the content information recorded in the first list for the given file, a storage location selection module configured to select in the first list one storage location for a given file amongst the storage locations where a change has been detected for the given file, a scanning module configured to determine whether a detected change corresponds to a malicious change by scanning a given
  • the present disclosure provides a management server, which uses the catalog for a resource effective protection from computer virus or malwares.
  • the catalog includes a first list, which records different information related to files, therefore the management server is notified of changes on the storage servers very close to the actual change, as compared to the conventional approach.
  • the management server of the present disclosure can distinguish between a malicious change and a normal change. Therefore, the management server further identifies the behavioral patterns leading to installation of a computer virus/malware and avoids, or warns of, these patterns before actual infection.
  • the management server performs effective antivirus scans using the scanning module, which does not scan a duplicate file multiple times as in the conventional approach. In other words, the management server of the present disclosure may detect the computer virus/malware infection without actually performing an antivirus scan. As a result, the performance of the storage servers of the present disclosure is improved in comparison to storage servers used conventionally.
  • the storage location selection module is configured to select in the first list a server, amongst the monitored servers corresponding to the storage locations where the change has been detected for a given file, that is the least used one, and selecting in the first list one of the locations of the given file in the selected server.
  • the management server causes no unnecessary load on heavily used storage servers by selecting the storage server which is the least used one.
  • the storage location selection module is configured to compare the number of reads and/or writes performed on the monitored servers corresponding to the storage locations where the change has been detected for the given file, in a given period of time, and to select in the first list one of the servers, corresponding to the storage locations where the change has been detected for the given file, for which the number of reads and/or writes is the smallest.
  • the management server does not cause an unnecessary load on heavily used storage servers. Moreover, the management server can identify how the change started and which storage server was initially affected.
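The least-used selection described above can be illustrated with a minimal Python sketch. It assumes the read/write counts for the chosen time window are already available as a dictionary; the names `StorageLocation`, `select_least_used`, and the server labels are hypothetical and do not come from the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageLocation:
    server: str  # monitored server holding the file (hypothetical label)
    path: str    # location of the file on that server


def select_least_used(changed_locations, io_counts):
    """Return one changed location on the server with the fewest reads+writes.

    io_counts maps a server label to the number of reads and writes observed
    in the chosen time window; a server with no entry is treated as idle.
    """
    if not changed_locations:
        raise ValueError("no changed locations to choose from")
    return min(changed_locations, key=lambda loc: io_counts.get(loc.server, 0))


# Illustrative data: the same changed file detected on three servers.
locations = [
    StorageLocation("nas-a", "/vm1/system32/kernel.dll"),
    StorageLocation("nas-b", "/vm7/system32/kernel.dll"),
    StorageLocation("nas-c", "/vm3/system32/kernel.dll"),
]
io_counts = {"nas-a": 9200, "nas-b": 310, "nas-c": 4100}
chosen = select_least_used(locations, io_counts)
```

Here the scan would be scheduled on `nas-b`, the server with the smallest I/O count, so the busier servers are left undisturbed.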
  • the catalog is further configured to store a second list, of files stored in one or more monitored servers amongst the storage servers and that have been affected by a malicious change, the second list recording, for each listed file, the identifier for identifying the listed file, the one or more storage locations for identifying the one or more monitored servers and the locations on the corresponding monitored servers, where the listed file is stored, the content information related to the content of the listed file.
  • the second list is used to verify the content information of the affected files, and also to identify that all the listed affected files are duplicates of the files stored at one or more monitored servers.
  • the management server further comprising a threat recording module configured to record in the second list, when a detected change is determined as a malicious change, the identifier of the given file, the one or more storage locations of the given file where the change has been detected, and the content information related to the content of the changed given file.
  • the second list is used to verify the content information of the affected files, and also to identify that all the listed affected files are duplicates of the files stored at one or more monitored servers. Moreover, the content information recorded by the second list is used to save information about previously affected (or infected) files, therefore removing the need to perform an actual scan.
  • the catalog is further configured to record in the second list, for a given listed file, and for each of the storage location of the given listed file, the time of a detected change determined as malicious.
  • the management server can identify the first time of change (i.e., patient 0) for the given listed file based on the time of detected change determined as malicious.
  • the management server further comprising a sorting module configured to sort, for a given file listed in the second list, the corresponding storage locations by time of malicious change.
  • the management server can identify the initially affected file as well as the initially affected storage location.
  • the management server can also identify the first time of change (i.e., patient 0) for the given listed file.
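As a rough illustration of the sorting module, the sketch below orders the storage locations of one file in the second list by the time of the malicious change and takes the earliest one as "patient 0". The tuple layout, the location strings, and the timestamps are assumptions made for this example only.

```python
from datetime import datetime


def sort_by_change_time(entries):
    """Sort (location, time_of_malicious_change) pairs, earliest first."""
    return sorted(entries, key=lambda entry: entry[1])


# Hypothetical second-list entries for a single file.
threat_entries = [
    ("nas-b:/vm7/app.exe", datetime(2021, 4, 28, 10, 15)),
    ("nas-a:/vm1/app.exe", datetime(2021, 4, 28, 9, 2)),
    ("nas-c:/vm3/app.exe", datetime(2021, 4, 28, 11, 40)),
]
ordered = sort_by_change_time(threat_entries)
patient_zero = ordered[0][0]  # location that was affected first
```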
  • the catalog is further configured to record in the first list, for each listed file, a protection mark for identifying the listed file as a protected or unprotected file, wherein the change detection module is further configured to determine if the changed given file is marked as protected in the first list and to trigger the storage selection module and the scanning module only if the changed given file is marked as protected in the first list.
  • the management server can perform effective protection of the resources, such as the storage servers.
  • the protection mark is also used to monitor changes to critical files (e.g., operating system files) that should not change unless a system upgrade is performed.
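A minimal sketch of how the protection mark could gate the later steps, assuming each first-list entry is a dictionary with a recorded content hash and a boolean `protected` flag. Both field names and the function name `should_trigger_scan` are hypothetical.

```python
def should_trigger_scan(entry, observed_hash):
    """Trigger selection and scanning only for changed, protected files."""
    changed = observed_hash != entry["content_hash"]
    return changed and entry["protected"]


protected_entry = {"content_hash": "aaa", "protected": True}
unprotected_entry = {"content_hash": "aaa", "protected": False}

scan_protected_changed = should_trigger_scan(protected_entry, "bbb")
scan_protected_same = should_trigger_scan(protected_entry, "aaa")
scan_unprotected_changed = should_trigger_scan(unprotected_entry, "bbb")
```

Only the first case would start the storage location selection and the scan; an unchanged file, or a changed file that is not marked as protected, is skipped.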
  • the present disclosure provides a method of file storage management in a management server, said management server comprising one or more storage servers configured to store files, the method comprising configuring a catalog to store a first list, of files stored in one or more monitored servers amongst the storage servers, the first list recording, for each listed file, an identifier for identifying the listed file, one or more storage locations for identifying the one or more monitored servers and one or more locations on the identified monitored servers, where the listed file is stored, and a content information related to the content of the listed file, detecting a change having occurred on a given file at one or more of the storage locations recorded in the first list for the given file, by determining the content information related to the content of the changed given file and comparing it with the content information recorded in the first list for the given file, selecting in the first list one storage location for the given file amongst the storage locations where the change has been detected for the given file, determining whether the detected change corresponds to a malicious change by scanning the given file at the selected storage location, marking all the files
  • the disclosed method achieves all the technical effects of the management server of the present disclosure.
  • FIG. 1A is a block diagram of a management server for managing the storage of files in one or more storage servers, in accordance with an embodiment of the present disclosure
  • FIG. 1B is a block diagram that illustrates various exemplary components of the management server, in accordance with an embodiment of the present disclosure
  • FIG. 1C is a block diagram that illustrates various exemplary components of the storage server, in accordance with an embodiment of the present disclosure
  • FIG. 2 is a flowchart of a method of file storage management in a management server, in accordance with an embodiment of the present disclosure.
  • FIGs. 3A, 3B, 3C, 3D and 3E are illustrations that depict a catalog cooperation mechanism, in accordance with various embodiments of the present disclosure.
  • an underlined number is employed to represent an item over which the underlined number is positioned or an item to which the underlined number is adjacent.
  • a non-underlined number relates to an item identified by a line linking the non-underlined number to the item.
  • the non-underlined number is used to identify a general item at which the arrow is pointing.
  • FIG. 1A is a block diagram of a management server for managing the storage of files in one or more storage servers, in accordance with an embodiment of the present disclosure.
  • With reference to FIG. 1A, there is shown a block diagram 100A that comprises a management server 102, files 104A to 104N, one or more storage servers 106A to 106N, a catalog 108, a first list 110, an identifier 112, one or more storage locations 114A to 114N, one or more locations 116A to 116N, and a content information 118.
  • There is further shown a change detection module 120, a storage location selection module 122, a scanning module 124, and a marking module 126.
  • There is further shown a second list 128, a threat recording module 130, and a sorting module 132.
  • the present disclosure provides a management server 102 for managing the storage of files 104 A to 104N in one or more storage server 106 A to 106N the management server 102 comprising: a catalog 108 configured to store a first list 110 of files 104A to 104N stored in one or more monitored servers amongst the one or more storage servers 106 A to 106N, the first list 110 recording, for each listed file, an identifier 112 for identifying the listed file, one or more storage locations 114A to 114N for identifying the one or more monitored servers and one or more locations 116A to 116N on the identified monitored servers, where the listed file is stored, and a content information 118 related to the content of the listed file, a change detection module 120 configured to detect a change having occurred on a given file at one or more of the storage locations 114A to 114N recorded in the first list 110 for the given file, by determining the content information 118 related to the content of the changed given file and comparing it with the content information 118 recorded in the
  • the management server 102 includes suitable logic, circuitry, interfaces and/or code that is configured for managing the storage of files 104A to 104N in one or more storage servers 106A to 106N.
  • the management server 102 is further configured to monitor changes in the files 104A to 104N also referred to as critical files. Further, the management server 102 performs improved antivirus scans which do not scan duplicate files in the files 104A to 104N.
  • the components of the management server 102 can be spread across different machines at different locations. Examples of the management server 102 include, but are not limited to, a management system, a server, a computer server, and the like.
  • the management server 102 is used to manage the files 104A to 104N stored at one or more storage servers 106A to 106N, and also to provide an effective protection ecosystem.
  • the files 104A to 104N correspond to executable files which are stored at the one or more storage servers 106A to 106N.
  • the files 104A to 104N may correspond to files associated with one or more virtual or physical machines connected to the one or more storage server 106A to 106N.
  • Examples of the files 104A to 104N include, but are not limited to, a system32 file, program files, and the like.
  • One or more storage server 106A to 106N include suitable logic, circuitry, interfaces and/or code that is configured to receive and store the files 104A to 104N associated with one or more virtual or physical machines.
  • the one or more storage server 106A to 106N may store files 104A to 104N along with an identifier associated with the virtual or physical machines.
  • Examples of the one or more storage servers 106A to 106N include, but are not limited to, a network attached storage (NAS) server.
  • the one or more storage server 106A to 106N may be storage servers of a single organization.
  • the catalog 108 holds an organization-wide view of file system, even those residing on different storage systems, such as the one or more storage server 106A to 106N and holds information about the first list 110.
  • the catalog 108 may also be referred to as a global unstructured data catalog.
  • the first list 110 holds a list of files 104A to 104N and information associated with such files 104A to 104N.
  • the first list 110 may also be referred to as a duplicate list, duplicate database, and the like.
  • the first list 110 holds information about file contents (e.g., a strong hash), change information, file attributes (e.g. system file), and the like.
  • the first list 110 stores information of the files 104A to 104N that are the same (i.e., duplicates) either within a storage server, such as the storage server 106A, or between one or more storage servers 106A to 106N.
  • the identifier 112 is used to identify the listed file from the first list 110.
  • the identifier 112 may also be referred to as a hash key.
  • the identifier 112 further includes a plurality of identifiers, such as identifiers 112A to 112N.
  • the one or more storage locations 114A to 114N correspond to locations that are stored for identifying the one or more monitored servers amongst the storage servers 106A to 106N.
  • the one or more locations 116A to 116N correspond to locations that are stored for identifying the listed files in the identified monitored servers.
  • the one or more locations 116A to 116N may correspond to an address of a virtual machine or files of virtual machines within the one or more storage server 106A to 106N.
  • the content information 118 corresponds to information associated with the content of the files 104A to 104N, such as a text, picture, video, data, and the like that describes the content of the files 104A to 104N.
  • the content information 118 includes information of all the files 104A to 104N that are affected, a hash (or an identifier) of the affected files 104A to 104N, an original version of the files 104A to 104N, and time of change of each file, and a behavioral pattern that is leading to a change of the files 104A to 104N.
  • the change detection module 120 is a software module that is used for monitoring changes in specific files 104A to 104N, such as operating system files that should not change unless a system upgrade is performed. Such specific files may also be referred to as critical files.
  • the change detection module 120 may also be implemented as a circuit in the management server 102.
  • the storage location selection module 122 is a software module that is used for the selection of the one or more storage locations 114A to 114N in the first list 110 for a given file, where a change has been detected for the given file by the change detection module 120.
  • the storage location selection module 122 may also be implemented as a circuit in the management server 102.
  • the scanning module 124 is a software module used to perform effective antivirus scans which do not scan a duplicate file more than once.
  • the scanning module 124 corresponds to an antivirus that is used for detection of computer virus/malware infection without actually performing the antivirus scan.
  • the scanning module 124 may be also implemented as a circuit in the management server 102.
  • the marking module 126 is a software module that is used to mark the changed files 104A to 104N based on the detected change (e.g., malicious change or normal change) in the files 104A to 104N. In other words, the marking module 126 is used to mark the changed files 104A to 104N which are having virus or malware.
  • the marking module 126 may also be implemented as a circuit in the management server 102.
  • the second list 128 holds the same information as in the first list 110 (or duplicate database) but with a long retention only for files 104A to 104N positively identified in the past as threats.
  • the second list 128 may also be referred to as a threat list.
  • the threat recording module 130 is a software module that is used to record the information related to the files 104A to 104N which are positively identified in the past as threats.
  • the sorting module 132 is a software module that is used to sort the storage locations 114A to 114N based on the time of malicious change in the files 104A to 104N.
  • Each of the threat recording module 130 and the sorting module 132 may be implemented as a circuit in the management server 102.
  • the catalog 108 is configured to store the first list 110 of the files 104A to 104N stored in one or more monitored servers amongst the storage servers 106A to 106N.
  • the first list 110 comprises recording, for each listed file, the identifier 112 for identifying the listed file, the one or more storage locations 114A to 114N for identifying the one or more monitored servers and the one or more locations 116A to 116N on the identified monitored servers, where the listed file is stored.
  • the first list 110 further comprises recording, the content information 118 related to the content of the listed file.
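To make the shape of the first list concrete, here is a minimal Python sketch. It assumes SHA-256 as the "strong hash" and uses that hash both as the identifier (hash key) and as the content information; the class name `FirstList` and its methods are hypothetical, and a real catalog would record further attributes such as change information and file attributes.

```python
import hashlib
from collections import defaultdict


def content_hash(data: bytes) -> str:
    """Content information for a file: a strong hash (SHA-256 assumed here)."""
    return hashlib.sha256(data).hexdigest()


class FirstList:
    """Maps a hash key to every (server, path) where that content is stored."""

    def __init__(self):
        self._locations = defaultdict(set)

    def record(self, server: str, path: str, data: bytes) -> str:
        key = content_hash(data)
        self._locations[key].add((server, path))
        return key

    def locations(self, key: str):
        # Sorted only to give a stable order for display.
        return sorted(self._locations.get(key, ()))


first_list = FirstList()
key_a = first_list.record("nas-a", "/vm1/sys.dll", b"same bytes")
key_b = first_list.record("nas-b", "/vm2/sys.dll", b"same bytes")
key_c = first_list.record("nas-a", "/vm1/notes.txt", b"other bytes")
duplicates = first_list.locations(key_a)
```

Because identical content yields the same key, the two copies of `sys.dll` end up under one entry, which is what allows duplicates to be treated as a single file.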
  • the catalog 108 of the management server 102 stores a duplicate file database, such as the first list 110 of the files 104A to 104N for managing the storage of the files 104A to 104N in one or more monitored servers.
  • the catalog 108 of the management server 102 manages the storage of the files 104A to 104N at the storage server 106A and the storage server 106B.
  • the first list 110 of the catalog 108 is configured to record (or store) the content information 118 related to the content of the listed file so as to manage each listed file 104A and 104N.
  • the content information 118 recorded by the first list 110 of the catalog 108 is used by the management server 102 to track each listed file 104A and 104N of the one or more storage servers 106A to 106N. Thereafter, one or more storage locations 114A to 114N recorded by the first list 110 of the catalog 108 are used by the catalog 108 to identify and track the monitored servers, such as the storage server 106A and the storage server 106B. The one or more storage locations 114A to 114N recorded by the first list 110 are further used by the catalog 108 to identify (and locate) the one or more locations 116A to 116N on the identified monitored servers, so as to determine the exact location of the listed file.
  • the storage location 114A recorded by the first list 110 is used by the catalog 108 to identify the storage server 106A and the location 116A is used to identify (i.e., locate) the file 104A on the identified storage server 106A.
  • the storage location 114B recorded by the first list 110 is used by the catalog 108 to identify the subsequent storage server, such as the storage server 106B and the location 116B of the subsequent files.
  • the catalog 108 uses the identifier 112 of the first list 110 (i.e., duplicate file database) to verify the content information 118 and also to identify that all the listed files are duplicates of the files 104A to 104N, which are stored at one or more location 116A to 116N.
  • the first list 110 of the catalog 108 records the identifier 112 to identify the duplicates for each listed file of the first list 110 from the files 104A to 104N, which are stored at one or more monitored servers.
  • the identifier 112A recorded on the first list 110 is used to identify that all the listed files are duplicates of the files 104A to 104N, which are stored at one or more monitored servers.
  • the identifier 112A recorded on the first list 110 is used to identify that few of the listed files are duplicates of the files 104A to 104N, while the identifier 112B is recorded on the first list 110 to identify that all the subsequent files are duplicates of the files 104A to 104N.
  • many storage systems have a duplicate database (either at a file level or a block level), but the catalog 108 has an organization-wide database with the first list 110, which allows both organization-wide detections, and effective selection of the files 104A to 104N for scanning (e.g., scans in a conventional network file system can cause a disturbance in conventional network file system operations).
  • the content information 118 recorded by the first list 110 is used to save information about previously infected files 104A to 104N, therefore removing the need to perform an actual scan.
  • the management server 102 further comprises the change detection module 120 configured to detect a change having occurred on a given file at one or more of the storage locations 114A to 114N recorded in the first list 110 for the given file, by determining the content information 118 related to the content of the changed given file and comparing it with the content information 118 recorded in the first list 110 for the given file.
  • the first list 110 records the content information 118 of the given file, such as the file 104A, which is the same either within one storage location, such as the storage location 114A, or between one or more of the storage locations 114A to 114N.
  • the change detection module 120 compares the content of the changed given file 104A with the content information 118 recorded by the first list 110, so as to detect the change on the file 104A.
  • the catalog 108 (or global unstructured data catalog) is notified of changes on the given file (or a file system) very close to the actual change.
  • the change detection module 120 need not compare the content information 118 at the subsequent storage location of the subsequent storage server. This is done as the first list 110 of the catalog 108 tracks changes to the given file, and stores that there has not been a change since the last comparison.
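The comparison performed by the change detection module can be sketched as follows, assuming the content information is a SHA-256 hash and that the current bytes at each recorded location are already at hand. The function names and sample data are hypothetical.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def detect_changed_locations(recorded_hash, current_contents):
    """Return the locations whose current content no longer matches the record."""
    return [
        location
        for location, data in current_contents.items()
        if sha256_hex(data) != recorded_hash
    ]


original = b"trusted system file"
recorded = sha256_hex(original)  # value held in the first list
current = {
    ("nas-a", "/vm1/sys.dll"): original,
    ("nas-b", "/vm2/sys.dll"): b"trusted system file + payload",
}
changed = detect_changed_locations(recorded, current)
```

Only the copy on `nas-b` is reported, so later steps work on that location and leave the unchanged copy alone.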
  • overall performance is improved and resources are efficiently utilized in the present disclosure.
  • the management server 102 further comprises the storage location selection module 122 configured to select in the first list 110 one storage location for a given file amongst the storage locations 114A to 114N where a change has been detected for the given file.
  • one or more storage servers 106A to 106N are running with a same operating system at one or more storage locations 114A to 114N.
  • each storage server 106A to 106N has the given file, such as the file 104A (e.g., a system32 file).
  • the given file may be located in each of the storage server 106A to 106N.
  • the storage location selection module 122 of the management server 102 is used to select one storage location from the first list 110 based on the detected change in the given file. For example, the storage location selection module 122 selects the storage location 114A on the storage server 106A as recorded by the first list 110, where the change detection module 120 detects the change in the given file. Beneficially, as compared to the conventional approach, the storage location selection module 122 is not limited to one storage server, such as the storage server 106A. Further, the first list 110 of the catalog 108 is used by the storage location selection module 122 to determine the exact location of the changed given file.
  • the management server 102 further comprises the scanning module 124 configured to determine whether a detected change corresponds to a malicious change by scanning a given file for which a change is detected, at a selected storage location of the given file.
  • a virus has changed the content information 118 of the given file, such as the file 104A (e.g., a system32 file) on one or more of the storage server 106A to 106N (or on several of the virtual machines).
  • the change is detected by the change detection module 120 in the first list 110.
  • the scanning module 124 (or an antivirus) scans the changed given file at the selected storage location of the changed given file, to determine if the change detected by the change detection module 120 corresponds to a malicious change or not.
  • the scanning module 124 scans the changed file 104A at the storage location 114A, and determines if the change in the changed file 104A is a malicious change or not. Moreover, the next time the same changed file shows up, the management server 102 does not need to schedule the scanning module 124 (or an antivirus scan), since the catalog 108 of the management server 102 already holds the content information 118 in its database indicating that such a file is an affected version of the given file. Beneficially, as compared to the conventional approach, the scanning module 124 performs effective scans (or antivirus scans) which do not scan a duplicate file more than once.
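The "scan a duplicate only once" behaviour can be approximated by remembering the verdict per content hash. The sketch below is an assumption-laden illustration: `ScanCache` is a hypothetical name, and `stub_scanner` is a stand-in for a real antivirus engine, not an actual detection method.

```python
import hashlib


def stub_scanner(data: bytes) -> bool:
    """Stand-in for an antivirus engine: flags content containing a marker."""
    return b"EVIL" in data


class ScanCache:
    """Remembers the verdict for each content hash so duplicates are not rescanned."""

    def __init__(self, scanner):
        self._scanner = scanner
        self._verdicts = {}
        self.scans_run = 0

    def is_malicious(self, digest: str, data: bytes) -> bool:
        if digest not in self._verdicts:
            self._verdicts[digest] = self._scanner(data)
            self.scans_run += 1
        return self._verdicts[digest]


payload = b"header EVIL trailer"
digest = hashlib.sha256(payload).hexdigest()
cache = ScanCache(stub_scanner)
first_verdict = cache.is_malicious(digest, payload)   # runs the scanner
second_verdict = cache.is_malicious(digest, payload)  # answered from the cache
```

The second lookup returns the stored verdict without invoking the scanner, which mirrors the idea that a previously identified affected version does not need another scan.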
  • a user of the management server 102 specifies an extent of the scanning module 124 (e.g., scope of virtual machine/ entire storage servers 106A to 106N/ entire organization) on the storage servers 106A to 106N (or CPU resources) on which the scan can run.
  • the user further requests load balancing (or an automatic load balancing) on the monitored servers, such as 20% load balancing on the storage server 106A, 40% load balancing on the storage server 106B, and 40% load balancing on the subsequent storage server.
  • each file is marked as scanned (e.g., with a version used for the scan).
  • the initial map may change in real-time, as some of the storage servers 106A to 106N may be busier than others. For example, if the storage server 106A and the storage server 106B are busier than the subsequent servers, then the scanning process of the scanning module 124 will be slower on the storage server 106A and the storage server 106B than on the subsequent storage servers.
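One possible way to build an initial scan map from the requested shares (20%, 40%, 40%) is sketched below. It assumes, for simplicity, that every server holds a copy of every unique file, as in the same-operating-system example, and it does not model the real-time rebalancing mentioned above. The function name `split_by_share` and the server labels are hypothetical.

```python
def split_by_share(items, shares):
    """Assign each unique file to exactly one server according to percentage shares.

    shares maps a server label to an integer percentage; the values must sum to 100.
    """
    if sum(shares.values()) != 100:
        raise ValueError("shares must sum to 100")
    plan = {}
    start = 0
    cumulative = 0
    servers = list(shares)
    for index, server in enumerate(servers):
        cumulative += shares[server]
        # The last server takes whatever remains, so no file is dropped.
        is_last = index == len(servers) - 1
        end = len(items) if is_last else len(items) * cumulative // 100
        plan[server] = items[start:end]
        start = end
    return plan


unique_hashes = [f"hash-{n}" for n in range(10)]
plan = split_by_share(
    unique_hashes, {"server-a": 20, "server-b": 40, "server-c": 40}
)
```

Each unique hash appears in exactly one server's work list, so a duplicate file is scanned once and the scan load follows the requested proportions.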
  • the user or a processor of the management server 102 is notified of the results of the scanning module 124.
  • a patch is applied to the scanning module 124 (e.g., antivirus getting updated with virus/malware information and a rescan is needed).
  • the catalog 108 balances a file system scan between multiple file systems, such as the storage servers 106A to 106N, such that duplicate files in the files 104A to 104N are only scanned once.
  • the management server 102 further comprises the marking module 126 to mark all the files 104A to 104N, stored at the locations 116A to 116N where a change has been detected for the given file, as valid if the detected change is not determined as being a malicious change, and as affected if the detected change is determined as being a malicious change.
  • the marking module 126 is used to mark all the files 104A to 104N of the first list 110 based on the results of the scanning module 124. For example, if the scanning module 124 detects that the detected change in the file 104A stored at the location 116A is not the malicious change, then the marking module 126 marks the given file as valid (or a valid file).
  • Conversely, if the scanning module 124 detects that the detected change in the given file stored at the location 116A is a malicious change, then the marking module 126 marks the given file as affected (or an affected file).
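The marking step can be sketched as applying a single scan verdict to every location where the same change was detected. The status strings follow the text ("valid" and "affected"); the function name and the sample locations are hypothetical.

```python
def mark_locations(changed_locations, is_malicious):
    """Apply one scan verdict to every location where the change was detected."""
    status = "affected" if is_malicious else "valid"
    return {location: status for location in changed_locations}


changed_locations = [
    ("nas-a", "/vm1/sys.dll"),
    ("nas-b", "/vm2/sys.dll"),
]
marks_after_clean_scan = mark_locations(changed_locations, is_malicious=False)
marks_after_malicious_scan = mark_locations(changed_locations, is_malicious=True)
```

Since all listed locations hold the same changed content, one scan at the selected location is enough to mark all of them.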
  • the catalog 108 saves the scan version, and an informed decision on which files need to be scanned can be made.
  • the catalog 108 creates a resource-effective company-wide (or organization-wide) computer virus/malware protection framework. Further, the catalog 108 balances a file system scan between multiple file systems such that duplicate files are only scanned once.
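A rough sketch of how duplicate files might be scanned only once, assuming duplicates are recognized by a SHA-256 digest of their content and that verdicts are remembered per scanner version. The function `scan_once`, the cache layout, and the stand-in `fake_scanner` are hypothetical.

```python
import hashlib


def scan_once(files, scanner, scanner_version, scanned):
    """Scan each distinct content once per scanner version.

    files: iterable of (location, content bytes)
    scanned: dict mapping (digest, scanner_version) -> verdict, reused across calls
    """
    verdicts = {}
    for location, content in files:
        key = (hashlib.sha256(content).hexdigest(), scanner_version)
        if key not in scanned:
            scanned[key] = scanner(content)  # only the first copy is actually scanned
        verdicts[location] = scanned[key]
    return verdicts


calls = []


def fake_scanner(content):
    """Stand-in for a real antivirus engine; flags content containing b'EVIL'."""
    calls.append(content)
    return b"EVIL" in content


cache = {}
result = scan_once(
    [("106A:/a", b"same"), ("106B:/a", b"same"), ("106B:/b", b"EVIL payload")],
    fake_scanner,
    "v1",
    cache,
)
```

Because the scanner version is part of the cache key, a patched scanner naturally triggers a rescan of the same content.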
  • one or more of the storage servers 106A to 106N run a number of operating systems at one or more of the storage locations 114A to 114N (or on different virtual machines).
  • the files 104A to 104N stored at one or more of the storage locations 114A to 114N are the same for each of the storage servers 106A to 106N, as the operating systems are also the same for each of the storage servers 106A to 106N.
  • the content information 118 recorded by the first list 110 is different for each of the storage servers 106A to 106N.
  • the pictures directory is different for each of the storage servers 106A to 106N, as every user saves his/her pictures there, so the content information 118 is different for each storage server 106A to 106N. Therefore, in such a case, the content information 118 recorded by the first list 110 is scanned for each storage server 106A to 106N, so that the load of the scan does not affect a single storage system.
  • a compute (or computation) required for the scan is much less than a compute which is required in the conventional approach (i.e., if every conventional virtual machine scanned its own conventional files).
  • the storage location selection module 122 is configured to select in the first list 110 a storage server, amongst the monitored servers corresponding to the storage locations 114A to 114N where the change has been detected for a given file, that is the least used one, and selecting in the first list 110 one of the locations 116A to 116N of the given file in the selected server.
  • the change detection module 120 detects a change in the given file at one or more storage locations 114A to 114N of the first list 110.
  • one or more storage locations 114A to 114N of the first list 110 are used by the storage location selection module 122 to select the least used storage server among the monitored servers, such as the storage server 106A, where the change has been detected by the change detection module 120 for the given file. Moreover, one or more storage locations 114A to 114N of the first list 110 are further used by the storage location selection module 122 to select one of the locations 116A to 116N of the given file in the selected server, where the change has been detected for the given file. For example, the storage location 114A of the first list 110 is used by the storage location selection module 122 to select the location 116A of the file 104A in the storage server 106A, where the change has been detected for the file 104A.
  • the scanning module 124 is used by the catalog 108 to schedule an immediate scan of the changed file on the selected location of the least used storage server.
  • the catalog 108 schedules an immediate antivirus scan on a machine that is not regularly used, and on a copy of the file which resides on the storage server which is the file system least used.
  • the catalog 108 schedules a virus scan using the scanning module 124 on a specific virtual machine, and on a specific location (e.g., run scan on dev.ABC.not frequently used on the respective directory, such as C:\Windows\System32).
  • scan is probably the most efficient scan possible since it does not affect production or heavily used development environment, and will not cause an unnecessary load on heavily used network file systems, such as the storage server other than the selected least used storage servers.
  • the storage location selection module 122 is configured to compare the number of reads and/or writes performed on the monitored servers corresponding to the storage locations 114A to 114N where the change has been detected for the given file, in a given period of time, and to select in the first list 110 one of the servers, corresponding to the storage locations 114A to 114N where the change has been detected for the given file, for which the number of reads and/or writes is the smallest.
  • the change detection module 120 detects a change on the given file in a given period of time on the monitored servers.
  • one or more storage locations 114A to 114N of the first list 110 are used by the storage location selection module 122 to identify the monitored servers, where the change has been detected by the change detection module 120 in a given time period for the given file.
  • the storage locations 114A and 114B of the first list 110 are used by the storage location selection module 122 to identify the storage server 106A and the storage server 106B, where the change has been detected.
  • the storage location selection module 122 of the management server 102 compares the number of reads and/or writes performed in a given time period on the monitored servers.
  • the storage locations 114A to 114N are used by the storage location selection module 122 to select one of the monitored servers for which the number of reads and/or writes of the changed given file is the smallest, to identify how the change started and which storage server was initially affected. For example, if the number of reads and/or writes of the changed file 104A is small at the storage server 106A as compared to the storage server 106B, then the storage location selection module 122 selects the storage server 106A, which means that the file 104A was initially changed at the storage server 106A. In other words, the smallest number of reads and/or writes are used by the storage location selection module 122 to identify which virtual machine was the initial machine affected, and a process of the spread of the virus or malware.
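The comparison of read/write counts can be sketched as below, assuming the counts for the given period are already available per server. The function `select_least_used`, the sample counts, and the server label `106C` are hypothetical.

```python
def select_least_used(io_counts, changed_servers):
    """Pick, among servers where the change was detected, the one with the fewest reads + writes.

    io_counts: dict mapping server -> (reads, writes) in the given period
    """
    return min(changed_servers, key=lambda server: sum(io_counts[server]))


io_counts = {"106A": (120, 30), "106B": (900, 400), "106C": (5, 1)}
# 106C is the quietest overall, but no change was detected there, so it is not a candidate.
selected = select_least_used(io_counts, ["106A", "106B"])
```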
  • file access time allows a spread timeline to be created making a post-mortem analysis of the spread of the virus or malware, which is used to identify flaws in network security. Further, the spread timeline is used to locate the first system to be affected (i.e. patient zero), which is beneficial to find the initial infection which is usually different than the method of spreading through the network.
  • the catalog 108 is further configured to store a second list 128, of files 104A to 104N stored in one or more monitored servers amongst the storage servers 106A to 106N and that have been affected by a malicious change.
  • the second list recording, for each listed file, the identifier 112 for identifying the listed file, the one or more storage locations 114A to 114N for identifying the one or more monitored servers and the locations on the corresponding monitored servers, where the listed file is stored, the content information 118 related to the content of the listed file.
  • the scanning module 124 determines that the detected change corresponds to the malicious change at one or more monitored servers.
  • the catalog 108 stores a threat database, such as the second list 128 of the files 104A to 104N that have been affected by the malicious change, so as to manage the affected files.
  • the second list 128 of the catalog 108 further records (or stores) the content information 118 related to the content of the listed file so as to manage the affected files 104A to 104N at one or more monitored servers.
  • one or more storage locations 114A to 114N recorded by the second list 128 are used by the catalog 108 to identify and track the content information 118 of the monitored servers, such as the storage server 106A and the storage server 106B.
  • the one or more storage locations 114A to 114N are further used by the catalog 108 to identify (and locate) the one or more locations 116A to 116N on the identified monitored servers, so as to determine the exact location of the affected files 104A to 104N. Thereafter, the catalog 108 uses the identifier 112 of the second list 128 (i.e., threat database) to verify the content information 118 and also to identify that all the listed files are duplicates of the files 104A to 104N, which are stored at one or more locations 116A to 116N of the monitored servers. In other words, the second list 128 of the catalog 108 records the identifier 112 to identify the duplicates for each listed file of the second list 128 from the files 104A to 104N stored at one or more storage servers 106A to 106N.
  • the management server 102 further comprises a threat recording module 130 configured to record in the second list 128, when a detected change is determined as a malicious change, the identifier of the given file, the one or more storage locations 114A to 114N of the given file where the change has been detected, and the content information 118 related to the content of the changed given file.
  • the second list 128 (or threat database) holds the same information as in the first list 110 (or duplicate database) but with a long retention only for the files 104A to 104N positively identified in the past as threats. Therefore, the threat recording module 130 is used by the management server 102 to record a different information into the second list 128 in comparison to the first list 110.
  • the information may include the time of change, the identifier 112 (or a hash value), the content information 118, and one or more storage locations 114A to 114N of the given file where the change has been detected.
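One possible shape for an entry of the second list is sketched below. The `ThreatRecord` class, its field names, and the sample values are assumptions made for illustration; the disclosure only states which pieces of information are recorded.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class ThreatRecord:
    identifier: str                 # identifier 112, e.g. a hash value
    content_info: str               # content information 118
    # storage location -> time at which the malicious change was detected there
    change_times: dict = field(default_factory=dict)


def record_threat(second_list, identifier, content_info, location, changed_at):
    """Add (or extend) the entry for a file whose change was found to be malicious."""
    record = second_list.setdefault(identifier, ThreatRecord(identifier, content_info))
    record.change_times[location] = changed_at
    return record


second_list = {}
record_threat(second_list, "abc123", "infected system file",
              ("106A", "116A"), datetime(2021, 4, 28, 10, 0))
record_threat(second_list, "abc123", "infected system file",
              ("106B", "116B"), datetime(2021, 4, 28, 10, 5))
```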
  • the content information 118 recorded by the second list 128 of the catalog 108 is used by the management server 102 to track each affected file 104A to 104N at the one or more storage servers 106A to 106N.
  • the content information 118 recorded by the second list 128 is used to save information about previously affected (or infected) files 104A to 104N, therefore removing the need to perform an actual scan.
  • the catalog 108 of the management server 102 already has the content information 118 of all the listed files that are affected.
  • the catalog 108 further has information related to an original version and time of change of each affected file, a behavioral pattern leading to the change (e.g., file ~/Downloads/I_am_a_malware.pdf added three seconds before the change), and an identifier (or a hash) of the affected file.
  • the management server 102 can either proactively “fix” all the affected files 104A to 104N if configured to do so or send a warning to a processor of the management server 102 with all the content information 118.
  • such information is used by the storage location selection module 122 of the management server 102 to receive the information for the location (or virtual machine) of the storage server, which was initially affected, and a process of start of the change (or spread) (e.g. download and open of a specific type of file).
  • the catalog 108 takes actions or warns that a virus may be present.
  • when the affected files show up again, the catalog 108 does not need to schedule an antivirus scan, since it already holds information in its database that this file is an affected version of the duplicate file of the first list 110.
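The shortcut described above, where a previously identified threat is recognized without a new scan, might look like the following. This is a sketch under the assumption that affected versions are keyed by a content digest; `handle_change`, `known_threats`, and `fake_scan` are hypothetical names.

```python
def handle_change(content_digest, known_threats, scan):
    """Return (verdict, scan_performed) for a changed file.

    known_threats: set of digests of versions already identified as affected
    scan: callable taking a digest and returning True if the content is malicious
    """
    if content_digest in known_threats:
        return "affected", False  # already in the threat database: no scan needed
    malicious = scan(content_digest)
    if malicious:
        known_threats.add(content_digest)
    return ("affected" if malicious else "valid"), True


scan_calls = []


def fake_scan(digest):
    """Stand-in scanner that treats one specific digest as malicious."""
    scan_calls.append(digest)
    return digest == "bad-digest"


known = set()
first = handle_change("bad-digest", known, fake_scan)
second = handle_change("bad-digest", known, fake_scan)
```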
  • the catalog 108 is further configured to record in the second list 128, for a given listed file, and for each of the storage location 114A to 114N of the given listed file, the time of a detected change determined as malicious.
  • the catalog 108 records the time at which the malicious change is detected for a given listed file and also records the time at which the malicious change is detected at each of the storage locations 114A to 114N of the given listed file.
  • recording the time of the detected malicious change is used by the catalog 108 to identify the initially affected file as well as the initially affected storage location.
  • the management server 102 further comprises a sorting module 132 configured to sort, for a given file listed in the second list 128, the corresponding storage locations 114A to 114N by time of malicious change.
  • the sorting module 132 is able to sort the storage location 114A to 114N of the given listed file by time of malicious change in the given listed file at each of the storage location 114A to 114N.
  • the management server 102 identifies the first time of change (i.e., patient 0) for the given listed file.
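Sorting by time of malicious change can be sketched as follows, assuming each storage location of a listed file carries a recorded timestamp. The function `spread_timeline`, the location labels, and the sample times are illustrative assumptions.

```python
from datetime import datetime


def spread_timeline(change_times):
    """Return (location, time) pairs ordered from the earliest to the latest malicious change."""
    return sorted(change_times.items(), key=lambda item: item[1])


change_times = {
    "106B:116B": datetime(2021, 4, 28, 10, 5),
    "106A:116A": datetime(2021, 4, 28, 10, 0),
    "106C:116C": datetime(2021, 4, 28, 10, 9),
}
timeline = spread_timeline(change_times)
patient_zero = timeline[0][0]  # location with the first recorded malicious change
```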
  • the catalog 108 is further configured to record in the first list 110, for each listed file, a protection mark for identifying the listed file as a protected or unprotected file, wherein the change detection module 120 is further configured to determine if the changed given file is marked as protected in the first list 110 and to trigger the storage location selection module 122 and the scanning module 124 only if the changed given file is marked as protected in the first list 110.
  • the catalog 108 monitors changes to critical files, such as operating files that should not change (unless a system upgrade is performed). Therefore, for such files, the protection mark is recorded by the catalog 108 in the first list 110 for each listed file to identify the protected and unprotected files.
  • the change detection module 120 triggers the storage location selection module 122 and the scanning module 124, to protect the corresponding changed file.
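A minimal sketch of the gating on the protection mark, assuming the first list is a mapping from file path to an entry with a boolean `protected` flag. The names `on_file_changed` and `fake_select_and_scan`, and the sample paths, are hypothetical.

```python
def on_file_changed(path, first_list, select_and_scan):
    """Trigger location selection and scanning only for files marked as protected."""
    entry = first_list.get(path)
    if entry is None or not entry.get("protected", False):
        return None  # unprotected or unlisted file: nothing is triggered
    return select_and_scan(path)


first_list = {
    "system/kernel.bin": {"protected": True},
    "home/pictures/cat.jpg": {"protected": False},
}
triggered = []


def fake_select_and_scan(path):
    """Stand-in for the storage location selection and scanning modules."""
    triggered.append(path)
    return "scan scheduled"


protected_result = on_file_changed("system/kernel.bin", first_list, fake_select_and_scan)
unprotected_result = on_file_changed("home/pictures/cat.jpg", first_list, fake_select_and_scan)
```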
  • the catalog 108 starts actions to verify if this change is legitimate or if it is malicious.
  • the only action required is marking files that are not prone to change, such as system files.
  • This can be done explicitly in the catalog 108 code, and built over time using an artificial intelligence system that learns files 104A to 104N that are affected by the different computer virus/malware.
  • the following steps are used by the catalog 108 for the protection triggered by the change of file as a normal update of the catalog 108:
  • affected files can be restored to their previous values, and any additional system clean ups can be done automatically (e.g. some viruses can be disabled by creating a file with a certain name, or possibly some additional files may need to be removed).
  • the protection mark is used by the catalog 108 of the management server 102 to perform effective protection of the organization resources, such as the storage servers 106A to 106N.
  • the protection mark is also used to monitor changes in critical files (e.g. operating files) that should not change unless a system upgrade is performed.
  • the present disclosure provides a management server 102, which uses the catalog 108 for a resource effective protection from computer virus or malwares.
  • the catalog 108 includes the first list 110, which records different information related to files 104A to 104N, therefore the management server 102 is notified of changes on the storage servers 106A to 106N very close to the actual change.
  • the management server 102 can distinguish between the malicious change and the normal change.
  • the management server 102 further identifies the behavioral patterns leading to the installation of a computer virus/malware and avoids or warns of these patterns before actual infection. Therefore, the management server 102 performs effective antivirus scans using the scanning module 124, which does not scan a duplicate file more than once.
  • FIG. 1B is a block diagram that illustrates various exemplary components of the management server, in accordance with an embodiment of the present disclosure. With reference to FIG. 1B, there is shown a block diagram 100B that illustrates the management server 102.
  • the management server 102 includes a first processor 134, a first transceiver 136, and a first memory 138.
  • the first processor 134 may be communicatively coupled to the first transceiver 136 and the first memory 138.
  • the management server 102 is coupled to the storage server 106A via the communication network 140. There is further shown the change detection module 120, the storage location selection module 122, the scanning module 124, and the marking module 126. Optionally, in some implementations, the threat recording module 130 and the sorting module 132 may also be provided. All such modules may be communicatively coupled to each other and to the first processor 134 of the management server 102.
  • the first processor 134 is configured to manage the files 104A to 104N stored at the storage servers 106A to 106N.
  • the first processor 134 may perform all the functionalities of the management server 102.
  • the first processor 134 may be a general-purpose processor.
  • Other examples of the first processor 134 may include, but are not limited to, a microprocessor, a microcontroller, a complex instruction set computing (CISC) processor, an application-specific integrated circuit (ASIC) processor, a reduced instruction set (RISC) processor, a very long instruction word (VLIW) processor, a central processing unit (CPU), a state machine, a data processing unit, and other processors or control circuitry.
  • the first processor 134 may refer to one or more individual processors, processing devices, or a processing unit that is part of a machine, such as the management server 102.
  • the first transceiver 136 includes suitable logic, circuitry, and interfaces that may be configured to communicate with one or more external devices, such as the storage server 106A.
  • Examples of the first transceiver 136 may include, but are not limited to, an antenna, a telematics unit, a radio frequency (RF) transceiver, one or more amplifiers, one or more oscillators, a digital signal processor, a coder-decoder (CODEC) chipset, and/or a subscriber identity module (SIM) card.
  • the first memory 138 refers to a primary storage system of the management server 102.
  • the first memory 138 includes suitable logic, circuitry, and interfaces that may be configured to store the catalog 108. Examples of implementation of the first memory 138 may include, but are not limited to, Electrically Erasable Programmable Read-Only Memory (EEPROM), Random Access Memory (RAM), Read Only Memory (ROM), Hard Disk Drive (HDD), Flash memory, Solid-State Drive (SSD), and/or CPU cache memory.
  • the first memory 138 may store an operating system and/or other program products (including one or more operation algorithms) to operate the management server 102.
  • the first processor 134 of the management server 102 is configured for managing the storage of files 104A to 104N in one or more storage servers 106A to 106N.
  • the first processor 134 is further configured to store, into the catalog 108, the first list 110 of the files 104A to 104N stored in one or more monitored servers amongst the storage servers 106A to 106N.
  • FIG. 1C is a block diagram that illustrates various exemplary components of the storage server, in accordance with an embodiment of the present disclosure.
  • With reference to FIG. 1C, there is shown a block diagram 100C of the storage server 106A amongst the one or more storage servers 106A to 106N.
  • the storage server 106A further includes a second processor 142, a second transceiver 144, and a second memory 146.
  • the second processor 142 may be communicatively coupled to the second transceiver 144 and the second memory 146.
  • the storage server 106A is communicatively coupled to the management server 102 via the communication network 140.
  • the second processor 142 is configured to execute the files 104A to 104N stored in the second memory 146.
  • the second processor 142 may be a general-purpose processor.
  • Other examples of the second processor 142 may include, but are not limited to, a microprocessor, a microcontroller, a complex instruction set computing (CISC) processor, an application-specific integrated circuit (ASIC) processor, a reduced instruction set (RISC) processor, a very long instruction word (VLIW) processor, a central processing unit (CPU), a state machine, a data processing unit, and other processors or control circuitry.
  • the second processor 142 of the storage server 106A is configured to store the files 104A to 104N in the second memory 146 of the storage server 106A.
  • the second transceiver 144 includes suitable logic, circuitry, and interfaces that may be configured to communicate with one or more external devices, such as the management server 102.
  • Examples of the second transceiver 144 may include, but are not limited to, an antenna, a telematics unit, a radio frequency (RF) transceiver, one or more amplifiers, one or more oscillators, a digital signal processor, a coder-decoder (CODEC) chipset, and/or a subscriber identity module (SIM) card.
  • the second memory 146 refers to a primary storage system of the storage server 106A.
  • Examples of implementation of the second memory 146 may include, but are not limited to, Electrically Erasable Programmable Read-Only Memory (EEPROM), Random Access Memory (RAM), Read Only Memory (ROM), Hard Disk Drive (HDD), Flash memory, Solid-State Drive (SSD), and/or CPU cache memory.
  • the second memory 146 may store an operating system and/or other program products (including one or more operation algorithms) to operate the storage server 106A.
  • FIG. 2 is a flowchart of a method of file storage management in a management server, in accordance with an embodiment of the present disclosure.
  • FIG. 2 is described in conjunction with elements from FIG. 1A.
  • With reference to FIG. 2, there is shown a method 200.
  • the method 200 includes steps 202, 204, 206, 208, and 210.
  • the present disclosure provides a method 200 of file storage management in a management server 102, said management server 102 comprising one or more storage servers 106A to 106N configured to store files, the method comprising: configuring a catalog 108 to store a first list 110, of files 104A to 104N stored in one or more monitored servers amongst the storage servers 106A to 106N, the first list 110 recording, for each listed file, an identifier 112 for identifying the listed file, one or more storage locations 114A to 114N for identifying the one or more monitored servers and the one or more locations 116A to 116N on the identified monitored servers, where the listed file is stored, and a content information 118 related to the content of the listed file, detecting a change having occurred on a given file at one or more of the storage locations 114A to 114N recorded in the first list 110 for the given file, by determining the content information 118 related to the content of the changed given file and comparing it with the content information 118 recorded in the first list 110
  • the present disclosure provides the method 200 of file storage management in a management server 102, the management server 102 comprising one or more storage servers 106A to 106N configured to store files.
  • the components of the management server 102 can be spread on different machines at different locations.
  • the management server 102 is used for managing the files 104A to 104N stored at the one or more storage servers 106A to 106N.
  • the method 200 comprises configuring the catalog 108 to store the first list 110 of files 104A to 104N stored in one or more monitored servers amongst the storage servers 106A to 106N.
  • the first list 110 comprises recording, for each listed file, the identifier 112 for identifying the listed file, one or more storage locations 114A to 114N for identifying the one or more monitored servers and the one or more locations 116A to 116N on the identified monitored servers, where the listed file is stored.
  • the first list 110 further comprises recording, the content information 118 related to the content of the listed file.
  • the method 200 comprises, storing a duplicate file database, such as the first list 110 of the files 104A to 104N, by the catalog 108 of the management server 102, for managing the storage of the files 104A to 104N in one or more monitored servers.
  • the catalog 108 of the management server 102 comprises, managing the storage of the files 104A to 104N at the storage server 106A and the storage server 106B.
  • the first list 110 of the catalog 108 is configured to record (or store) the content information 118 related to the content of the listed file for managing each listed file 104A to 104N.
  • the content information 118 recorded by the first list 110 of the catalog 108 is used by the management server 102 for tracking each listed file 104A to 104N of the one or more storage servers 106A to 106N. Thereafter, one or more storage locations 114A to 114N recorded by the first list 110 of the catalog 108 are used by the catalog 108 for identifying and tracking the monitored servers, such as the storage server 106A and the storage server 106B. The one or more storage locations 114A to 114N recorded by the first list 110 are further used by the catalog 108 for identifying (and locating) the one or more locations 116A to 116N on the identified monitored servers, for determining the exact location of the listed file.
  • the storage location 114A recorded by the first list 110 is used by the catalog 108 for identifying the storage server 106A, and the location 116A is used for identifying (i.e., locating) the file 104A on the identified storage server 106A.
  • the storage location 114B recorded by the first list 110 is used by the catalog 108 for identifying the subsequent storage server, such as the storage server 106B and the location 116B of the subsequent files.
  • the catalog 108 comprises using the identifier 112 of the first list 110 (i.e., duplicate file database) for verifying the content information 118 and also for identifying that all the listed files are duplicates of the files 104A to 104N, which are stored at one or more location 116A to 116N.
  • the first list 110 of the catalog 108 records the identifier 112 for identifying the duplicates for each listed file of the first list 110 from the files 104A to 104N, which are stored at one or more monitored servers.
  • the identifier 112A recorded on the first list 110 is used for identifying that all the listed files are duplicates of the files 104A to 104N, which are stored at one or more monitored servers.
  • the identifier 112A recorded on the first list 110 is used for identifying that few of the listed files are duplicates of the files 104A to 104N, while the identifier 112B is recorded on the first list 110 for identifying that all the subsequent files are duplicates of the files 104A to 104N.
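The grouping of duplicates under a shared identifier can be sketched as below, assuming the identifier is a SHA-256 hash of the file content (the disclosure mentions a hash value as one possible identifier). The function `build_first_list` and the sample paths and contents are hypothetical.

```python
import hashlib
from collections import defaultdict


def build_first_list(files):
    """Group file copies by content hash.

    files: iterable of (server, path, content bytes)
    Returns a dict mapping identifier -> list of (server, path) holding that content.
    """
    first_list = defaultdict(list)
    for server, path, content in files:
        identifier = hashlib.sha256(content).hexdigest()
        first_list[identifier].append((server, path))
    return dict(first_list)


first_list = build_first_list([
    ("106A", "system/kernel.bin", b"kernel v1"),
    ("106B", "system/kernel.bin", b"kernel v1"),   # duplicate of the copy on 106A
    ("106B", "home/pictures/cat.jpg", b"jpeg bytes"),
])
kernel_id = hashlib.sha256(b"kernel v1").hexdigest()
```

Two copies with identical content end up under one identifier, which is what allows a single scan result to be applied to all of them.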
  • the method 200 further comprises, detecting a change having occurred on a given file at one or more of the storage locations 114A to 114N recorded in the first list 110 for the given file, by determining the content information 118 related to the content of the changed given file and comparing it with the content information 118 recorded in the first list 110 for the given file.
  • the first list 110 comprises, recording the content information 118 of the given file, such as the file 104A, which is same either within one storage location, such as the storage location 114A, or between one or more of the storage locations 114A to 114N.
  • the method 200 comprises, comparing the content of the changed given file 104A with the content information 118 recorded by the first list 110, to detect the change on the file 104A, such as using the change detection module 120 of FIG. 1A.
  • the catalog 108 (or global unstructured data catalog) is notified of changes on the given file (or a file system) very close to the actual change.
  • the change detection module 120 need not compare the content information 118 at the subsequent storage location of the subsequent storage server. This is because the first list 110 of the catalog 108 tracks changes to the given file, and records that there has not been a change since the last comparison.
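Change detection by comparing content information can be sketched as follows, assuming the recorded content information is a SHA-256 digest per storage location. The function `detect_changes` and the sample data are illustrative assumptions, not the disclosed implementation.

```python
import hashlib


def detect_changes(recorded_info, current_contents):
    """Return the locations whose current content no longer matches the recorded digest.

    recorded_info: dict mapping location -> digest stored in the first list
    current_contents: dict mapping location -> current content bytes
    """
    changed = []
    for location, content in current_contents.items():
        if hashlib.sha256(content).hexdigest() != recorded_info.get(location):
            changed.append(location)
    return changed


original_digest = hashlib.sha256(b"kernel v1").hexdigest()
recorded = {"106A:116A": original_digest, "106B:116B": original_digest}
current = {"106A:116A": b"kernel v1 + injected code", "106B:116B": b"kernel v1"}
changed_locations = detect_changes(recorded, current)
```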
  • the method 200 comprises improving the overall performance of the management server 102, and utilizing the resources efficiently.
  • the method 200 further comprises, selecting in the first list 110 one storage location for the given file amongst the storage locations 114A to 114N where the change has been detected for the given file.
  • the method 200 comprises, running a same operating system at one or more storage locations 114A to 114N of the one or more storage server 106A to 106N.
  • each storage server 106A to 106N has the given file, such as the file 104A (e.g., a system32 file).
  • the given file may be located in each of the storage server 106A to 106N.
  • the management server 102 comprises using the storage location selection module 122 for selecting one storage location from the first list 110 based on the detected change in the given file.
  • the storage location selection module 122 comprises selecting the storage location 114A on the storage server 106A as recorded by the first list 110, where the change detection module 120 detects the change in the given file.
  • the storage location selection module 122 is not limited to one storage server, such as the storage server 106A.
  • the storage location selection module 122 uses the first list 110 of the catalog 108 for determining the exact location of the changed given file.
  • the method 200 further comprises, determining whether the detected change corresponds to a malicious change by scanning the given file at the selected storage location.
  • the method 200 comprises, changing the content information 118 of the given file, such as the file 104A (e.g., a system32 file) by a virus, on one or more of the storage server 106A to 106N (or on several of the virtual machines).
  • the method 200 further comprises, detecting the change in the first list 110, such as by the change detection module 120. Thereafter, the scanning module 124 (or an antivirus) comprises scanning the changed given file at the selected storage location of the changed given file for determining, if the change detected by the change detection module 120 corresponds to a malicious change or not.
  • the scanning module 124 comprises scanning the changed file 104A at the storage location 114A, and determining if the change in the changed file 104A is a malicious change or not. Moreover, next time if any changed given file shows up, the management server 102 does not need to schedule the scanning module 124 (or an antivirus scan), since the catalog 108 of the management server 102 holds the content information 118 in its database that such file is an affected version of given file. Beneficially, as compared to the conventional approach, the scanning module 124 performs the effective scans (or antivirus scans) which do not scan a duplicate file more than once.
  • the method 200 further comprises, marking all the files 104A to 104N, stored at the storage locations 114A to 114N where the change has been detected for the given file, as valid if the detected change is not determined as being a malicious change, and as affected if the detected change is determined as being a malicious change.
  • the method 200 comprises, marking the files 104A to 104N of the first list 110 based on the results of the scanning module 124, such as using the marking module 126. For example, if the scanning module 124 comprises, detecting that the detected change in the file 104A stored at the location 116A is not the malicious change, then the marking module 126 comprises, marking the given file as valid (or a valid file).
  • the marking module 126 comprises marking the given file as affected (or affected file).
  • the catalog 108 saves the scan version, and an informed decision on which files need to be scanned can be made.
  • the catalog 108 creates a resource-effective organization-wide computer virus/malware protection framework and balances the storage server (or file system) scan between multiple storage servers 106A to 106N (or file systems), such that duplicate files 104A to 104N are only scanned once.
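Taken together, the selecting, scanning, and marking steps of the method 200 could be wired up roughly as below. This is a sketch under stated assumptions: `handle_detected_change` is a hypothetical name, and the selection and scanning steps are passed in as plain callables rather than the modules of the disclosure.

```python
def handle_detected_change(changed_locations, select_location, scan, marks):
    """Select one location, scan the file there once, and mark every changed copy."""
    selected = select_location(changed_locations)
    malicious = scan(selected)
    for location in changed_locations:
        marks[location] = "affected" if malicious else "valid"
    return selected, malicious


scanned_at = []


def fake_scan(location):
    """Stand-in scanner that records where it ran and reports a malicious change."""
    scanned_at.append(location)
    return True


marks = {}
outcome = handle_detected_change(
    ["106A:116A", "106B:116B", "106C:116C"],
    select_location=lambda locations: locations[0],  # placeholder selection policy
    scan=fake_scan,
    marks=marks,
)
```

Three changed copies result in one scan, and its verdict is applied to all three locations.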
  • the method 200 further comprises selecting in the first list 110 one storage location comprising selecting in the first list 110 a storage server, amongst the monitored servers corresponding to the storage locations 114A to 114N where the change has been detected for the given file, that is the least used one, and selecting in the first list 110 one of the locations of the given file in the selected storage server.
  • the change detection module 120 detects the change in the given file at one or more storage locations 114A to 114N of the first list 110. Therefore, the method 200 comprises using the one or more storage locations 114A to 114N of the first list 110 for selecting the least used storage server among the monitored servers where the change has been detected by the change detection module 120 for the given file, such as by the storage location selection module 122.
  • one or more storage locations 114A to 114N of the first list 110 are further used by the storage location selection module 122 for selecting one of the locations 116A to 116N of the given file in the selected server, where the change has been detected for the given file.
  • the catalog 108 uses the scanning module 124 for scheduling an immediate scan of the changed file at the selected location on the least used storage server.
  • the catalog 108 schedules an immediate antivirus scan on a machine that is not regularly used, and on a copy of the file which resides on the least used storage server (or file system).
  • such scanning is efficient since it does not affect production or heavily used development environments, and does not cause an unnecessary load on heavily used network file systems, such as the storage servers other than the selected least used storage server.
  • the method 200 further comprises selecting in the first list 110 a storage server comprising comparing the number of reads and/or writes performed on the monitored servers corresponding to the storage locations 114A to 114N where the change has been detected for the given file, in a given period of time, and selecting in the first list 110 one of the storage servers 106 A to 106N, corresponding to the storage locations 114A to 114N where the change has been detected for the given file, for which the number of reads and/or writes is the smallest.
  • the method 200 comprises, detecting a change on the given file in a given period of time on the monitored servers, such as using the change detection module 120.
  • the storage location selection module 122 uses one or more storage locations 114A to 114N of the first list 110 for identifying the monitored servers, where the change has been detected by the change detection module 120 in a given period of time for the given file. Thereafter, the storage location selection module 122 of the management server 102 comprises comparing the number of reads and/or writes performed in a given period of time on the monitored servers. Moreover, the storage location selection module 122 comprises using the storage locations 114A to 114N for selecting one of the monitored servers for which the number of reads and/or writes of the changed given file is the smallest, for identifying how the change started and which storage server was initially affected.
  • the method 200 comprising configuring the catalog 108 further comprises storing a second list 128 of files stored in one or more monitored servers amongst the storage servers that have been affected by a malicious change, the second list 128 recording, for each listed file, the identifier 112 for identifying the listed file, the one or more storage locations 114A to 114N for identifying the one or more monitored servers and the locations 116A to 116N on the corresponding monitored servers, where the listed file is stored, the content information 118 related to the content of the listed file.
  • the method 200 comprises determining that the detected change corresponds to the malicious change at one or more monitored servers, such as using the scanning module 124 of FIG. 1A.
  • the catalog 108 stores a threat database, such as the second list 128 of the files 104A to 104N that have been affected by the malicious change, for managing the affected files.
  • the second list 128 of the catalog 108 further records (or stores) the content information 118 related to the content of the listed file for managing the affected files 104A to 104N at one or more monitored servers.
  • the catalog 108 comprises using the one or more storage locations 114A to 114N recorded by the second list 128 for identifying and tracking the content information 118 of the monitored servers, such as the storage server 106 A and the storage server 106B.
  • the catalog 108 further comprises using the one or more storage locations 114A to 114N for identifying (and locating) the one or more locations 116A to 116N on the identified monitored servers, such as for determining the exact location of the affected files 104A to 104N. Thereafter, the catalog 108 uses the identifier 112 of the second list 128 (i.e., threat database) for verifying the content information 118 and also for verifying that all the listed files are duplicates of the files 104A to 104N, which are stored at one or more locations 116A to 116N of the monitored servers.
  • the method 200 further comprises, when determining that the detected change corresponds to a malicious change, recording in the second list 128 the identifier 112 of the given file, the one or more storage locations 114A to 114N of the given file where the change has been detected, and the content information related to the content of the changed given file.
  • the second list 128 (or threat database) holds the same information as the first list 110 (or duplicate database), but with long retention and only for the files 104A to 104N positively identified in the past as threats. Therefore, the method 200 comprises recording the information related to such files 104A to 104N into the second list 128, as distinct from the first list 110, such as using the threat recording module 130.
  • the management server 102 uses the content information 118 recorded by the second list 128 of the catalog 108 for tracking each affected file 104A to 104N at the one or more storage servers 106A to 106N. Moreover, the content information 118 recorded by the second list 128 is used to save information about previously affected (or infected) files 104A to 104N, therefore removing the need to perform an actual scan.
  • the method 200 further comprises recording in the second list 128, for a given listed file, and for each of the storage locations 114A to 114N of the given listed file, the time of the detected change determined as malicious.
  • the method 200 comprises recording the time at which the malicious change is detected for a given listed file in the catalog 108, and also recording the time at which the malicious change is detected at each of the storage locations 114A to 114N of the given listed file. Therefore, the method 200 comprises identifying the initially affected file as well as the initially affected storage location, by using the recorded time of the detected malicious change.
  • the method 200 further comprises, for a given file listed in the second list 128, sorting the corresponding storage locations 114A to 114N by time of malicious change.
  • the method 200 comprises sorting the storage locations 114A to 114N of the given listed file by the time of the malicious change in the given listed file at each of the storage locations 114A to 114N, such as using the sorting module 132.
  • the management server 102 identifies the first time of change (i.e., patient 0) for the given listed file.
  • the first list 110 further comprises, for each listed file, a protection mark for identifying the listed file as a protected or unprotected file, wherein the method 200 further comprises, after detecting a change, verifying whether the changed given file is marked as protected in the first list 110, and wherein selecting in the first list 110 one storage location and determining whether the detected change corresponds to a malicious change is performed only if the changed given file is marked as protected in the first list 110.
  • the method 200 comprises, monitoring changes, by the catalog 108, to critical files, such as operating files that should not change (unless a system upgrade is performed).
  • the protection mark is recorded by the catalog 108 in the first list 110 for each listed file for identifying the protected files and the unprotected files. Moreover, if the changed given file is marked as protected in the first list 110, then the change detection module 120 triggers the storage location selection module 122 and the scanning module 124 for protecting the corresponding changed file.
  • the present disclosure provides the method 200, which uses the catalog 108 for resource-effective protection from computer viruses or malware.
  • the catalog 108 includes the first list 110, which records different information related to files 104A to 104N, therefore the management server 102 is notified of changes on the storage servers 106A to 106N very close to the actual change.
  • the method 200 is able to distinguish between the malicious change and the normal change.
  • the method 200 further identifies the behavioral patterns leading to the installation of a computer virus/malware and avoids/warns of these patterns before actual infection. Therefore, the method 200 performs effective antivirus scans using the scanning module 124, which does not scan a duplicate file more than once. In other words, the method 200 may detect the computer virus/malware infection without actually performing an antivirus scan.
  • the method 200 balances the negative effects (which are present in the conventional approach) of antivirus scans using the scanning module 124 on the storage servers 106A to 106N.
  • FIG. 3A is an illustration that depicts a catalog cooperation mechanism, in accordance with an embodiment of the present disclosure.
  • FIG. 3A is described in conjunction with elements from FIG. 1A.
  • there is shown an illustration 300A that depicts the catalog 108, one or more network attached storage systems, such as a network attached storage (NAS) system 302, a network attached storage system 304 and a network attached storage system 306, a duplicate database 308, a system32 file 310, and a hash value 312.
  • the one or more network attached storage systems 302, 304, and 306 correspond to one or more storage servers, such as the storage servers 106A to 106N of FIG. 1A.
  • Each of the network attached storage systems 302, 304, and 306 includes suitable logic, circuitry, interfaces and/or code that is configured to store and manage the system32 file 310.
  • the duplicate database 308 corresponds to a first list, such as the first list 110 of FIG. 1A.
  • the duplicate database 308 holds a list of the duplicate system32 file 310.
  • the duplicate database 308 holds information about the hash value 312, change information, file attributes (e.g. system file), and the like.
  • the duplicate database 308 stores information on the system32 files 310 that are the same either within a network attached storage system, such as the network attached storage system 302, or between one or more network attached storage systems 302, 304, and 306.
  • the system32 files 310 correspond to files such as the files 104A to 104N.
  • the system32 files 310 are executable files which are stored at one or more network attached storage systems 302, 304, and 306. In an example, each of the system32 files 310 may be associated with a virtual machine.
  • the system32 files 310 are used here in an example, but any other files (typically critical files) may be used.
  • the hash value 312 corresponds to an identifier such as the identifier 112 of FIG. 1A.
  • the hash value 312 is used to identify the listed file from the duplicate database 308.
  • the content information is represented as ‘XXXX’ in FIG. 3A.
  • the representation ‘XXXX’ is only used for exemplary purpose.
  • in FIG. 3A, there are multiple network attached storage systems 302, 304, and 306 that are running an operating system in an organization, and each network attached storage system 302, 304, and 306 has a system file named system32 file 310.
  • the catalog 108 tracks changes in the network attached storage systems 302, 304, and 306.
  • the duplicate database 308 also records the hash value 312 and content information for each system32 file 310, as shown by "XXXX" in FIG. 3A.
  • the hash value 312 and the duplicate database 308 are used by the catalog 108 to identify that all of the system32 files 310 of the duplicate database 308 are duplicates of the system32 files 310, which are stored at one or more of the network attached storage systems 302, 304, and 306.
  • the hash value 312 is used by the catalog 108 to identify the network attached storage systems 302, 304, and 306, and also to identify the location of the system32 file 310 on the identified network attached storage systems 302, 304, and 306, as shown in FIG. 3A by arrows. Therefore, the hash value 312 and the duplicate database 308 are used by the catalog 108 to manage the system32 files 310 of the network attached storage systems 302, 304, and 306.
  • FIG. 3B is an illustration that depicts a catalog cooperation mechanism, in accordance with another embodiment of the present disclosure.
  • FIG. 3B is described in conjunction with elements from FIGs. 1A and 3A.
  • the hash value 314 corresponds to the changed system32 file 310.
  • the duplicate database 308 records the hash value 312 of the given file, such as the system32 file 310. Thereafter, if the content information of the system32 file 310 recorded in the duplicate database 308 is changed, such as from "XXXX" to "YYYY" as shown in FIG. 3B, then the new hash value 314 is used to identify the changed system32 file 310 on the network attached storage systems 302, 304, and 306. Moreover, the change detection module 120 of FIG. 1A compares the content information of the changed system32 file 310 with the content information recorded by the duplicate database 308, such as the content information 118 of FIG. 1A, so as to detect the change on the system32 file 310. For example, the content information "YYYY" of the changed system32 file 310 is compared with the content information "XXXX" to detect the change on the changed system32 file 310.
  • the catalog 108 is notified of changes on the file 104A (or a file system) very close to the actual change, as shown by dotted lines in FIG. 3B.
  • the new hash value 314 is used to identify the location of the changed system32 file 310 on each of the network attached storage systems 302, 304, and 306.
  • the hash value 314 is used to represent the changed system32 files 310, which are located at each of the network attached storage systems 302, 304, and 306.
  • the change detection module 120 need not compare the hash value 314 at the subsequent storage locations of the subsequent network attached storage systems 302, 304, and 306. Beneficially, as compared to the conventional approach, the overall performance of each network attached storage system 302, 304, and 306 is improved by virtue of the catalog 108.
  • FIG. 3C is an illustration that depicts a catalog cooperation mechanism, in accordance with yet another embodiment of the present disclosure.
  • FIG. 3C is described in conjunction with elements from FIGs. 1A, 3A, and 3B.
  • there is shown an antivirus 316.
  • the antivirus 316 corresponds to the scanning module 124 of FIG. 1A.
  • a virus may have changed the content information of the given file; for example, for the system32 file 310, the content information is changed from "XXXX" to "YYYY", as shown in FIG. 3C.
  • the hash value 314 of the changed system32 file 310 is used by the catalog 108 to identify the location of the changed system32 file 310 (shown by dotted lines) on each of the network attached storage systems 302, 304, and 306.
  • the catalog 108 starts actions to verify if this change is legitimate or if it is malicious.
  • the catalog 108 schedules an immediate antivirus scan using the antivirus 316 on a specific location of the network attached storage system that is not regularly used, and on a copy of the system32 file 310 which resides on the least used network attached storage system.
  • the catalog 108 schedules a virus scan using the antivirus 316 on the network attached storage system 302 (e.g. run scan on dev.ABC.not frequently used on respective directory), as shown in FIG. 3C by dotted lines.
  • such scan is the most efficient scan since it does not affect production or heavily used development environment, and does not cause an unnecessary load on heavily used network attached storage systems, such as the network attached storage systems 304 and 306.
  • FIG. 3D is an illustration that depicts a catalog cooperation mechanism, in accordance with another embodiment of the present disclosure.
  • FIG. 3D is described in conjunction with elements from FIGs. 1A, 3A, 3B, and 3C.
  • if the antivirus 316 detects a virus on the system32 file 310 that it has scanned, then at this stage the catalog 108 already has the new hash value 314 of all the listed files that are affected.
  • the catalog 108 further has information related to an original version and time of change of each affected system32 file 310, a behavioral pattern leading to the change (e.g., file ~/Downloads/I_am_a_malware.pdf added three seconds before the change), and the hash value 314 (or an identifier) of the affected system32 file 310. Moreover, based on the new hash value 314 of all the listed system32 files 310 that are affected, the catalog 108 can either proactively “fix” all the affected system32 files 310, if configured to do so, or send a warning to the management server 102 with all the hash values that need to be cleaned within the management server 102.
  • such information is used to identify the location (or virtual machine) of the network attached storage systems 302, 304, and 306 that was initially affected, and how the change (or spread) started.
  • if the catalog 108 identifies the affected system32 file 310, or a file with a different name but the same contents (e.g., ~/Downloads/I_am_a_malware.pdf), then the catalog 108 takes pre-emptive actions or warns that a virus is imminent.
  • the next time the affected system32 file 310 shows up, the catalog 108 does not need to schedule an antivirus scan, since it already holds information in its database that this file is an affected version of the duplicate file of the duplicate database 308.
  • FIG. 3E is an illustration that depicts a catalog cooperation mechanism for reducing a load on network attached storage, in accordance with an embodiment of the present disclosure.
  • FIG. 3E is described in conjunction with elements from FIGs. 1A, 3A, 3B, 3C, and 3D.
  • in FIG. 3E, there is shown a plurality of pictures directories 318, 320, 322, 324, and 326.
  • the hash values 328, 330, 332 and 324 correspond to the identifier 112 of FIG. 1A.
  • the pictures directories 318, 320, 322, 324, and 326 correspond to image files, which are stored on one or more of the network attached storage systems 302, 304 and 306.
  • the pictures directories 318, 320, and 322 are stored at the network attached storage system 302.
  • the pictures directory 324 is stored at the network attached storage system 304.
  • the pictures directory 326 is stored at the network attached storage system 306.
  • the network attached storage systems 302, 304, and 306 run a number of operating systems on different virtual machines, such as the storage locations 114A to 114N.
  • the system32 files 310 stored at different virtual machines are the same for each of the network attached storage systems 302, 304 and 306, as the operating systems are also the same for each of the network attached storage systems 302, 304 and 306.
  • the content information 118 recorded by the duplicate database 308 is different for each of the network attached storage systems 302, 304, and 306.
  • the pictures directories 318, 320, 322, 324, and 326 are different for each of the network attached storage systems 302, 304, and 306, as each user saves his/her pictures there.
  • the content information 118 recorded by the duplicate database 308 is scanned for each network attached storage system 302, 304, and 306, so that the load of the scan does not fall on a single network attached storage system.
  • the hash values are used to identify the location of the pictures directories on each of the network attached storage systems 302, 304, and 306, as shown in FIG. 3E.
  • the content information "YYYY" is scanned for the pictures directory 318 on the network attached storage system 302, which is identified using the hash value 328.
  • the content information "AAAA" is also scanned for the pictures directory 318 at the network attached storage system 302, which is identified using the corresponding hash value 330, and the like.
  • the compute required for the scan is much less than the compute which would have been required in the conventional approach (i.e., if every conventional virtual machine scanned its own conventional files).
  • the system32 files 310 have already been scanned (e.g., pictures directory) on one of the storage servers (or virtual machines)
  • the system32 files 310 have not changed since the last scan
  • such files need not be scanned again, since the catalog 108 tracks changes to files and notes that there has not been a change since the last scan.
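
The load-reduction idea described for FIG. 3E can be sketched as follows. This is only an illustrative sketch, not the disclosed implementation: the function name `plan_scans`, the tuple layout, the server names and the hash strings are all hypothetical. It assumes that each distinct content hash needs to be scanned once, that content unchanged since its last scan is skipped, and that the scan is assigned to whichever holding server has the fewest scans planned so far.

```python
def plan_scans(copies, already_scanned):
    """Plan one scan per distinct content hash, spread across servers.

    copies: iterable of (server, path, content_hash) tuples (hypothetical layout).
    already_scanned: set of hashes that have not changed since their last scan.
    Returns {server: [(path, content_hash), ...]}.
    """
    holders = {}
    for server, path, content_hash in copies:
        holders.setdefault(content_hash, []).append((server, path))

    plan = {}
    for content_hash in sorted(holders):
        if content_hash in already_scanned:
            continue  # unchanged since the last scan: no rescan needed
        # Among the servers holding this content, pick the one with the fewest
        # scans assigned so far (ties broken by server name).
        server, path = min(
            holders[content_hash],
            key=lambda sp: (len(plan.get(sp[0], [])), sp[0]),
        )
        plan.setdefault(server, []).append((path, content_hash))
    return plan


# Hypothetical data: six file copies, three distinct contents.
copies = [
    ("nas-302", "/vm1/system32", "h-sys"),
    ("nas-304", "/vm2/system32", "h-sys"),
    ("nas-306", "/vm3/system32", "h-sys"),
    ("nas-302", "/vm1/pictures/a.jpg", "h-a"),
    ("nas-304", "/vm2/pictures/b.jpg", "h-b"),
    ("nas-304", "/vm2/pictures/a_copy.jpg", "h-a"),
]
plan = plan_scans(copies, already_scanned={"h-sys"})
```

With these made-up inputs, only two scans are planned for six stored copies, and they land on two different servers.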

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Virology (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A management server for managing the storage of files in one or more storage servers comprises a catalog configured to store a first list of the files stored in one or more monitored servers amongst the storage servers. The first list records, for each listed file, an identifier, one or more storage locations, and content information related to the content of the listed file. The management server further comprises a change detection module, a storage location selection module, a scanning module, and a marking module. The management server is configured to detect a change on a given file, and select in the first list one storage location where a change has been detected for the given file. The management server performs effective antivirus scans which do not scan a duplicate file multiple times as in the conventional approach.

Description

MANAGEMENT SERVER AND METHOD FOR FILE STORAGE MANAGEMENT
TECHNICAL FIELD
The present disclosure relates generally to the field of storage management and more specifically, to a management server and a method for managing storage of files in storage servers.
BACKGROUND
Computer viruses and malware are generally attached to a file (or a computer program) by unauthorised users. Such viruses and malware can replicate and spread after an initial execution of the corresponding file on a computer system. They are harmful and can destroy critical files and computer data, and can thereby slow down the computer system. In general, protecting against computer viruses and malware requires substantial computational resources, which are allocated for file system scans and for monitoring of changes to specific files. However, such resources are often not used efficiently; for example, a duplicate file is scanned several times on several systems, or multiple copies are scanned on the same computer system. Further, such scans impose a load on network attached storage (NAS) servers which are coupled to various computer systems, where access to each file needs to be synchronized since the network attached storage server supports concurrent access. Synchronization is usually performed by locking parts of the file system. However, locking parts of the file system reduces the overall performance of the network attached storage server, and thus there exists a technical problem of scanning duplicate files several times on the same and on different computer systems.
Conventionally, there are several methods to optimize scans on the network attached storage servers, such as a log based analysis, which is generally used to detect changes in critical computer system components such as event registry and syslog of the computer system. However, such optimized scans are limited to the network attached storage server itself, and not outside the network attached storage. Thus, duplicate files on different network attached storage servers may still be scanned multiple times by respective network attached storage servers which thereby impacts the performance. Another possible solution is based on antivirus scans for detecting footprints of computer viruses and malware on the computer systems. Generally, such antivirus scans are executed by coupling the network attached storage server with additional backup servers (or a computer), which performs antivirus scans. However, such backup servers can perform offline antivirus scans on changed files only and are limited to the backup servers only. Therefore, this solution does not solve the actual technical problem of scanning duplicate files in different computer systems on various network attached storage servers.
Therefore, in light of the foregoing discussion, there exists a need to overcome the aforementioned drawbacks associated with the conventional approaches to scanning for computer viruses and malware.
SUMMARY
The present disclosure provides a management server and a method for managing storage of files in storage servers. The present disclosure provides a solution to the existing problem of scanning of duplicate files, several times, on same and different computer systems. An objective of the present disclosure is to provide a solution that overcomes at least partially the problems encountered in the prior art and provides an improved system and a method for file storage management to avoid the existing problem of scanning of duplicate files, several times, on same and different computer systems.
One or more objectives of the present disclosure are achieved by the solutions provided in the enclosed independent claims. Advantageous implementations of the present disclosure are further defined in the dependent claims.
In one aspect, the present disclosure provides a management server for managing the storage of files in one or more storage servers, the management server comprising: a catalog configured to store a first list of files stored in one or more monitored servers amongst the storage servers, the first list recording, for each listed file, an identifier for identifying the listed file, one or more storage locations for identifying the one or more monitored servers and the one or more locations on the identified monitored servers where the listed file is stored, and content information related to the content of the listed file; a change detection module configured to detect a change having occurred on a given file at one or more of the storage locations recorded in the first list for the given file, by determining the content information related to the content of the changed given file and comparing it with the content information recorded in the first list for the given file; a storage location selection module configured to select in the first list one storage location for a given file amongst the storage locations where a change has been detected for the given file; a scanning module configured to determine whether a detected change corresponds to a malicious change by scanning a given file for which a change is detected, at a selected storage location of the given file; and a marking module configured to mark all the files stored at the storage locations where a change has been detected for the given file as valid if the detected change is not determined as being a malicious change, and as affected if the detected change is determined as being a malicious change.
The present disclosure provides a management server, which uses the catalog for resource-effective protection from computer viruses or malware. As the catalog includes a first list, which records different information related to files, the management server is notified of changes on the storage servers very close to the actual change, as compared to the conventional approach. Moreover, the management server of the present disclosure can distinguish between a malicious change and a normal change. Therefore, the management server further identifies the behavioral patterns leading to the installation of a computer virus/malware and avoids/warns of these patterns before actual infection. Moreover, the management server performs effective antivirus scans using the scanning module, which does not scan a duplicate file multiple times as in the conventional approach. In other words, the management server of the present disclosure may detect the computer virus/malware infection without actually performing an antivirus scan. As a result, the performance of the storage servers of the present disclosure is improved in comparison to storage servers used conventionally.
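As a rough illustration of the first list and the change detection module, the following sketch models one catalog entry and compares the content information at each recorded storage location against the catalog. It is a sketch under stated assumptions: the class and function names are hypothetical, the content information is assumed to be a SHA-256 digest (the disclosure only speaks of content information and a hash value), and `read_file` stands in for whatever mechanism reads a file from a monitored server.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass(frozen=True)
class StorageLocation:
    server: str  # monitored server, e.g. "nas-302" (hypothetical name)
    path: str    # location of the file on that server


@dataclass
class FirstListEntry:
    identifier: str    # identifies the listed file
    content_info: str  # assumed here: SHA-256 of the last known content
    locations: list = field(default_factory=list)


def content_info(data: bytes) -> str:
    """Assumed form of the content information: a SHA-256 hex digest."""
    return hashlib.sha256(data).hexdigest()


def detect_changes(entry, read_file):
    """Return the recorded locations whose current content differs from the catalog."""
    return [
        loc for loc in entry.locations
        if content_info(read_file(loc)) != entry.content_info
    ]


# Hypothetical example: three copies of one file, one of which was modified.
original = b"XXXX"
entry = FirstListEntry(
    identifier="system32",
    content_info=content_info(original),
    locations=[
        StorageLocation("nas-302", "/vm1/system32"),
        StorageLocation("nas-304", "/vm2/system32"),
        StorageLocation("nas-306", "/vm3/system32"),
    ],
)
store = {loc: original for loc in entry.locations}
store[entry.locations[1]] = b"YYYY"  # simulate a change on one server
changed = detect_changes(entry, store.__getitem__)
```

Here `changed` contains only the location on the second server, which is the set of candidates from which a scan location would then be selected.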
In an implementation form, the storage location selection module is configured to select in the first list a server, amongst the monitored servers corresponding to the storage locations where the change has been detected for a given file, that is the least used one, and to select in the first list one of the locations of the given file in the selected server.
Beneficially, the management server causes no unnecessary load on heavily used storage servers by selecting the storage server which is the least used one.
In a further implementation form, the storage location selection module is configured to compare the number of reads and/or writes performed on the monitored servers corresponding to the storage locations where the change has been detected for the given file, in a given period of time, and to select in the first list one of the servers, corresponding to the storage locations where the change has been detected for the given file, for which the number of reads and/or writes is the smallest.
In this implementation, the management server does not cause an unnecessary load on heavily used storage servers. Moreover, the management server can identify how the change started and which storage server was initially affected.
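The selection of the least used server can be sketched as below. This is a minimal sketch with assumptions: the names `IoStats` and `select_scan_location` are hypothetical, usage is taken to be the sum of reads and writes in the observation period, and ties are broken by server name purely to keep the result deterministic.

```python
from collections import namedtuple

# Hypothetical per-server counters for the given period of time.
IoStats = namedtuple("IoStats", ["reads", "writes"])


def select_scan_location(changed_locations, io_stats):
    """Pick one changed location on the server with the fewest reads + writes.

    changed_locations: list of (server, path) where the change was detected.
    io_stats: {server: IoStats} for the given period of time.
    """
    servers = {server for server, _ in changed_locations}
    least_used = min(
        servers,
        key=lambda s: (io_stats[s].reads + io_stats[s].writes, s),
    )
    for server, path in changed_locations:
        if server == least_used:
            return server, path


# Hypothetical figures: nas-302 is a rarely used development system.
changed_locations = [
    ("nas-304", "/vm2/system32"),
    ("nas-302", "/dev/system32"),
    ("nas-306", "/prod/system32"),
]
io_stats = {
    "nas-302": IoStats(reads=10, writes=2),
    "nas-304": IoStats(reads=500, writes=120),
    "nas-306": IoStats(reads=900, writes=300),
}
selected = select_scan_location(changed_locations, io_stats)
```

The scan would then be scheduled at `selected`, leaving the heavily used servers untouched.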
In a further implementation form, the catalog is further configured to store a second list of files that are stored in one or more monitored servers amongst the storage servers and that have been affected by a malicious change, the second list recording, for each listed file, the identifier for identifying the listed file, the one or more storage locations for identifying the one or more monitored servers and the locations on the corresponding monitored servers where the listed file is stored, and the content information related to the content of the listed file.
Beneficially, the second list is used to verify the content information of the affected files, and also to identify that all the listed affected files are duplicates of the files stored at one or more monitored servers.
In a further implementation form, the management server further comprises a threat recording module configured to record in the second list, when a detected change is determined as a malicious change, the identifier of the given file, the one or more storage locations of the given file where the change has been detected, and the content information related to the content of the changed given file.
The second list is used to verify the content information of the affected files, and also to identify that all the listed affected files are duplicates of the files stored at one or more monitored servers. Moreover, the content information recorded by the second list is used to save information about previously affected (or infected) files, therefore removing the need to perform an actual scan.
In a further implementation form, the catalog is further configured to record in the second list, for a given listed file, and for each of the storage locations of the given listed file, the time of a detected change determined as malicious.
Beneficially, the management server can identify the first time of change (i.e., patient 0) for the given listed file based on the time of detected change determined as malicious.
In a further implementation form, the management server further comprises a sorting module configured to sort, for a given file listed in the second list, the corresponding storage locations by time of malicious change.
By sorting the corresponding storage locations by the time of malicious change, the management server can identify the initially affected file as well as the initially affected storage location. The management server can also identify the first time of change (i.e., patient 0) for the given listed file.
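The sorting step can be illustrated with a minimal Python sketch. The record layout (`ThreatLocation`), the server names, paths and timestamps are hypothetical and chosen only for illustration; the disclosure does not prescribe any particular data layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThreatLocation:
    """Hypothetical second-list entry for one copy of an affected file."""
    server: str        # monitored server holding the copy
    path: str          # location of the copy on that server
    changed_at: float  # time of the malicious change (epoch seconds)


def sort_by_change_time(locations):
    """Return the locations ordered from earliest to latest malicious change."""
    return sorted(locations, key=lambda loc: loc.changed_at)


def patient_zero(locations):
    """Return the location with the earliest malicious change, or None."""
    ordered = sort_by_change_time(locations)
    return ordered[0] if ordered else None


# Illustrative data only.
locations = [
    ThreatLocation("server-b", "/vm2/system32/file.dll", 1700000300.0),
    ThreatLocation("server-a", "/vm1/system32/file.dll", 1700000100.0),
    ThreatLocation("server-c", "/vm7/system32/file.dll", 1700000200.0),
]
first = patient_zero(locations)
```

Here `first` is the copy on `server-a`, the earliest recorded malicious change among the three copies.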
In a further implementation form, the catalog is further configured to record in the first list, for each listed file, a protection mark for identifying the listed file as a protected or unprotected file, wherein the change detection module is further configured to determine if the changed given file is marked as protected in the first list and to trigger the storage selection module and the scanning module only if the changed given file is marked as protected in the first list.
By virtue of using a protection mark, the management server can perform effective protection of the resources, such as the storage servers. The protection mark is also used to monitor changes to critical files (e.g., operating system files) that should not change unless a system upgrade is performed.
In another aspect, the present disclosure provides a method of file storage management in a management server, said management server comprising one or more storage servers configured to store files, the method comprising configuring a catalog to store a first list, of files stored in one or more monitored servers amongst the storage servers, the first list recording, for each listed file, an identifier for identifying the listed file, one or more storage locations for identifying the one or more monitored servers and one or more locations on the identified monitored servers, where the listed file is stored, and a content information related to the content of the listed file, detecting a change having occurred on a given file at one or more of the storage locations recorded in the first list for the given file, by determining the content information related to the content of the changed given file and comparing it with the content information recorded in the first list for the given file, selecting in the first list one storage location for the given file amongst the storage locations where the change has been detected for the given file, determining whether the detected change corresponds to a malicious change by scanning the given file at the selected storage location, marking all the files, stored at the storage locations where the change has been detected for the given file, as valid if the detected change is not determined as being a malicious change, and as affected if the detected change is determined as being a malicious change.
The disclosed method achieves all the technical effects of the management server of the present disclosure.
It is to be appreciated that all the aforementioned implementation forms can be combined. It has to be noted that all devices, elements, circuitry, units and means described in the present application could be implemented in the software or hardware elements or any kind of combination thereof. All steps which are performed by the various entities described in the present application as well as the functionalities described to be performed by the various entities are intended to mean that the respective entity is adapted to or configured to perform the respective steps and functionalities. Even if, in the following description of specific embodiments, a specific functionality or step to be performed by external entities is not reflected in the description of a specific detailed element of that entity which performs that specific step or functionality, it should be clear for a skilled person that these methods and functionalities can be implemented in respective software or hardware elements, or any kind of combination thereof. It will be appreciated that features of the present disclosure are susceptible to being combined in various combinations without departing from the scope of the present disclosure as defined by the appended claims.
Additional aspects, advantages, features and objects of the present disclosure would be made apparent from the drawings and the detailed description of the illustrative implementations construed in conjunction with the appended claims that follow.
BRIEF DESCRIPTION OF THE DRAWINGS
The summary above, as well as the following detailed description of illustrative embodiments, is better understood when read in conjunction with the appended drawings. For the purpose of illustrating the present disclosure, exemplary constructions of the disclosure are shown in the drawings. However, the present disclosure is not limited to specific methods and instrumentalities disclosed herein. Moreover, those in the art will understand that the drawings are not to scale. Wherever possible, like elements have been indicated by identical numbers.
Embodiments of the present disclosure will now be described, by way of example only, with reference to the following diagrams wherein:
FIG. 1A is a block diagram of a management server for managing the storage of files in one or more storage servers, in accordance with an embodiment of the present disclosure;
FIG. 1B is a block diagram that illustrates various exemplary components of the management server, in accordance with an embodiment of the present disclosure;
FIG. 1C is a block diagram that illustrates various exemplary components of the storage server, in accordance with an embodiment of the present disclosure;
FIG. 2 is a flowchart of a method of file storage management in a management server, in accordance with an embodiment of the present disclosure; and
FIGs. 3A, 3B, 3C, 3D and 3E are illustrations that depict a catalog cooperation mechanism, in accordance with various embodiments of the present disclosure.
In the accompanying drawings, an underlined number is employed to represent an item over which the underlined number is positioned or an item to which the underlined number is adjacent. A non-underlined number relates to an item identified by a line linking the non-underlined number to the item. When a number is non-underlined and accompanied by an associated arrow, the non-underlined number is used to identify a general item at which the arrow is pointing.
DETAILED DESCRIPTION OF EMBODIMENTS
The following detailed description illustrates embodiments of the present disclosure and ways in which they can be implemented. Although some modes of carrying out the present disclosure have been disclosed, those skilled in the art would recognize that other embodiments for carrying out or practicing the present disclosure are also possible.
FIG. 1A is a block diagram of a management server for managing the storage of files in one or more storage servers, in accordance with an embodiment of the present disclosure. With reference to FIG. 1A, there is shown a block diagram 100A that comprises a management server 102, files 104A to 104N, one or more storage servers 106A to 106N, a catalog 108, a first list 110, an identifier 112, one or more storage locations 114A to 114N, one or more locations 116A to 116N, and a content information 118. There is further shown a change detection module 120, a storage location selection module 122, a scanning module 124, and a marking module 126. There is further shown a second list 128, a threat recording module 130, and a sorting module 132.
In one aspect, the present disclosure provides a management server 102 for managing the storage of files 104A to 104N in one or more storage servers 106A to 106N, the management server 102 comprising: a catalog 108 configured to store a first list 110 of files 104A to 104N stored in one or more monitored servers amongst the one or more storage servers 106A to 106N, the first list 110 recording, for each listed file, an identifier 112 for identifying the listed file, one or more storage locations 114A to 114N for identifying the one or more monitored servers and one or more locations 116A to 116N on the identified monitored servers, where the listed file is stored, and a content information 118 related to the content of the listed file; a change detection module 120 configured to detect a change having occurred on a given file at one or more of the storage locations 114A to 114N recorded in the first list 110 for the given file, by determining the content information 118 related to the content of the changed given file and comparing it with the content information 118 recorded in the first list 110 for the given file; a storage location selection module 122 configured to select in the first list 110 one storage location for a given file amongst the one or more storage locations 114A to 114N where a change has been detected for the given file; a scanning module 124 configured to determine whether a detected change corresponds to a malicious change by scanning a given file for which a change is detected, at a selected storage location of the given file; and a marking module 126 to mark all the files 104A to 104N, stored at the one or more locations 116A to 116N where a change has been detected for the given file, as valid if the detected change is not determined as being a malicious change, and as affected if the detected change is determined as being a malicious change.
The management server 102 includes suitable logic, circuitry, interfaces and/or code that is configured for managing the storage of files 104A to 104N in one or more storage servers 106A to 106N. The management server 102 is further configured to monitor changes in the files 104A to 104N, also referred to as critical files. Further, the management server 102 performs improved antivirus scans which do not scan duplicate files in the files 104A to 104N. In an example, the components of the management server 102 can be spread on different machines at different locations. Examples of the management server 102 include, but are not limited to, a management system, a server, a computer server, and the like. The management server 102 is used to manage the files 104A to 104N stored at one or more storage servers 106A to 106N, and also to provide an effective protection ecosystem.
The files 104A to 104N correspond to executable files which are stored at the one or more storage servers 106A to 106N. In other words, the files 104A to 104N may correspond to files associated with one or more virtual or physical machines connected to the one or more storage servers 106A to 106N. Examples of the files 104A to 104N include, but are not limited to, a system32 file, program files, and the like.
The one or more storage servers 106A to 106N include suitable logic, circuitry, interfaces and/or code that is configured to receive and store the files 104A to 104N associated with one or more virtual or physical machines. The one or more storage servers 106A to 106N may store the files 104A to 104N along with an identifier associated with the virtual or physical machines. Examples of the one or more storage servers 106A to 106N include, but are not limited to, a network attached storage (NAS) server. In an example, the one or more storage servers 106A to 106N may be storage servers of a single organization.
The catalog 108 holds an organization-wide view of file systems, even those residing on different storage systems, such as the one or more storage servers 106A to 106N, and holds information about the first list 110. In an example, the catalog 108 may also be referred to as a global unstructured data catalog. The first list 110 holds a list of files 104A to 104N and information associated with such files 104A to 104N. In an example, the first list 110 may also be referred to as a duplicate list, duplicate database, and the like. The first list 110 holds information about file contents (e.g., a strong hash), change information, file attributes (e.g., system file), and the like. In an implementation, the first list 110 stores information of the files 104A to 104N that are the same (i.e., duplicates) either within a storage server, such as the storage server 106A, or between one or more storage servers 106A to 106N.
The identifier 112 is used to identify the listed file from the first list 110. In an example, the identifier 112 may also be referred to as a hash key. The identifier 112 further includes a plurality of identifiers, such as identifiers 112A to 112N.
The one or more storage locations 114A to 114N correspond to locations that are stored for identifying the one or more monitored servers amongst the storage servers 106A to 106N. The one or more locations 116A to 116N correspond to locations that are stored for identifying the listed files in the identified monitored servers. In an example, the one or more locations 116A to 116N may correspond to an address of a virtual machine or files of virtual machines within the one or more storage servers 106A to 106N.
The content information 118 corresponds to information associated with the content of the files 104A to 104N, such as a text, picture, video, data, and the like that describes the content of the files 104A to 104N. In an example, the content information 118 includes information of all the files 104A to 104N that are affected, a hash (or an identifier) of the affected files 104A to 104N, an original version of the files 104A to 104N, and time of change of each file, and a behavioral pattern that is leading to a change of the files 104A to 104N.
The change detection module 120 is a software module that is used for monitoring changes in specific files 104A to 104N, such as operating system files that should not change unless a system upgrade is performed. Such specific files may also be referred to as critical files. The change detection module 120 may also be implemented as a circuit in the management server 102.
The storage location selection module 122 is a software module that is used for the selection of the one or more storage locations 114A to 114N in the first list 110 for a given file, where a change has been detected for the given file by the change detection module 120. The storage location selection module 122 may also be implemented as a circuit in the management server 102.
The scanning module 124 is a software module used to perform effective antivirus scans which do not scan a duplicate file more than once. In an example, the scanning module 124 corresponds to an antivirus that is used for detection of computer virus/malware infection without actually performing the antivirus scan. The scanning module 124 may be also implemented as a circuit in the management server 102.
The marking module 126 is a software module that is used to mark the changed files 104A to 104N based on the detected change (e.g., malicious change or normal change) in the files 104A to 104N. In other words, the marking module 126 is used to mark the changed files 104A to 104N that contain a virus or malware. The marking module 126 may also be implemented as a circuit in the management server 102.
The second list 128 holds the same information as in the first list 110 (or duplicate database) but with a long retention only for files 104A to 104N positively identified in the past as threats. The second list 128 may also be referred to as a threat list.
The threat recording module 130 is a software module that is used to record the information related to the files 104A to 104N which are positively identified in the past as threats. The sorting module 132 is a software module that is used to sort the storage locations 114A to 114N based on the time of malicious change in the files 104A to 104N. Each of the threat recording module 130 and the sorting module 132 may be implemented as a circuit in the management server 102.
In operation, the catalog 108 is configured to store the first list 110 of the files 104A to 104N stored in one or more monitored servers amongst the storage servers 106A to 106N. The first list 110 records, for each listed file, the identifier 112 for identifying the listed file, the one or more storage locations 114A to 114N for identifying the one or more monitored servers and the one or more locations 116A to 116N on the identified monitored servers, where the listed file is stored. The first list 110 further records the content information 118 related to the content of the listed file. In other words, the catalog 108 of the management server 102 stores a duplicate file database, such as the first list 110 of the files 104A to 104N, for managing the storage of the files 104A to 104N in one or more monitored servers. For example, the catalog 108 of the management server 102 manages the storage of the files 104A to 104N at the storage server 106A and the storage server 106B. Moreover, for each listed file, the first list 110 of the catalog 108 is configured to record (or store) the content information 118 related to the content of the listed file so as to manage each listed file 104A to 104N. In an example, the content information 118 recorded by the first list 110 of the catalog 108 is used by the management server 102 to track each listed file 104A to 104N of the one or more storage servers 106A to 106N. Thereafter, the one or more storage locations 114A to 114N recorded by the first list 110 of the catalog 108 are used by the catalog 108 to identify and track the monitored servers, such as the storage server 106A and the storage server 106B. The one or more storage locations 114A to 114N recorded by the first list 110 are further used by the catalog 108 to identify (and locate) the one or more locations 116A to 116N on the identified monitored servers, so as to determine the exact location of the listed file.
For example, the storage location 114A recorded by the first list 110 is used by the catalog 108 to identify the storage server 106A and the location 116A is used to identify (i.e., locate) the file 104A on the identified storage server 106A. Similarly, the storage location 114B recorded by the first list 110 is used by the catalog 108 to identify the subsequent storage server, such as the storage server 106B and the location 116B of the subsequent files. Thereafter, the catalog 108 uses the identifier 112 of the first list 110 (i.e., duplicate file database) to verify the content information 118 and also to identify that all the listed files are duplicates of the files 104A to 104N, which are stored at one or more locations 116A to 116N. In other words, the first list 110 of the catalog 108 records the identifier 112 to identify the duplicates for each listed file of the first list 110 from the files 104A to 104N, which are stored at one or more monitored servers. In an example, the identifier 112A recorded on the first list 110 is used to identify that all the listed files are duplicates of the files 104A to 104N, which are stored at one or more monitored servers. In another example, the identifier 112A recorded on the first list 110 is used to identify that a few of the listed files are duplicates of the files 104A to 104N, while the identifier 112B is recorded on the first list 110 to identify that all the subsequent files are duplicates of the files 104A to 104N.
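One possible in-memory shape for such a duplicate database is sketched below in Python. It assumes, for illustration only, that the identifier is a SHA-256 hash of the file content (the description mentions a strong hash without naming one) and that each location is a (server, path) pair; the class name `FirstList` and its methods are hypothetical.

```python
import hashlib
from collections import defaultdict


def content_hash(data: bytes) -> str:
    """Strong hash of the file content (SHA-256 is an assumption here)."""
    return hashlib.sha256(data).hexdigest()


class FirstList:
    """Hypothetical duplicate database: identifier -> set of (server, path)."""

    def __init__(self):
        self._locations = defaultdict(set)

    def record(self, server: str, path: str, data: bytes) -> str:
        """Record one copy of a file and return its identifier."""
        identifier = content_hash(data)
        self._locations[identifier].add((server, path))
        return identifier

    def duplicates(self, identifier: str):
        """Return every known location of files sharing this identifier."""
        return sorted(self._locations.get(identifier, ()))


# Illustrative data only: two identical copies and one unrelated file.
first_list = FirstList()
key_a = first_list.record("106A", "/vm1/sys/kernel.dll", b"same bytes")
key_b = first_list.record("106B", "/vm4/sys/kernel.dll", b"same bytes")
key_c = first_list.record("106A", "/vm1/pictures/cat.png", b"other bytes")
```

Because identical content yields the same identifier, `key_a` and `key_b` are equal and `first_list.duplicates(key_a)` lists both copies, across servers.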
In an implementation, many storage systems have a duplicate database (either at a file level or a block level), but the catalog 108 has an organization-wide database with the first list 110, which allows both organization-wide detections and effective selection of the files 104A to 104N for scanning (e.g., scans in a conventional network file system can cause a disturbance in conventional network file system operations). Moreover, the content information 118 recorded by the first list 110 is used to save information about previously infected files 104A to 104N, therefore removing the need to perform an actual scan.
The management server 102 further comprises the change detection module 120 configured to detect a change having occurred on a given file at one or more of the storage locations 114A to 114N recorded in the first list 110 for the given file, by determining the content information 118 related to the content of the changed given file and comparing it with the content information 118 recorded in the first list 110 for the given file. The first list 110 records the content information 118 of the given file, such as the file 104A, which is the same either within one storage location, such as the storage location 114A, or between one or more of the storage locations 114A to 114N. Therefore, the change detection module 120 compares the content of the changed given file 104A with the content information 118 recorded by the first list 110, so as to detect the change on the file 104A. In an example, the catalog 108 (or global unstructured data catalog) is notified of changes on the given file (or a file system) very close to the actual change. Moreover, if the content information 118 related to the given file has not changed since the last comparison, then the change detection module 120 need not compare the content information 118 at the subsequent storage location of the subsequent storage server. This is because the first list 110 of the catalog 108 tracks changes to the given file, and stores that there has not been a change since the last comparison. Beneficially, as compared to the conventional approach, overall performance is improved and resources are efficiently utilized in the present disclosure.
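The comparison described above can be sketched in Python as follows. This is a minimal illustration that assumes the content information is a SHA-256 hash and that the current file contents are available as bytes; the function names and the sample data are hypothetical.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Content information for a file (SHA-256 is an assumption here)."""
    return hashlib.sha256(data).hexdigest()


def detect_changes(recorded_hash: str, current_contents: dict):
    """Return the locations whose current hash differs from the recorded one.

    current_contents maps (server, path) -> current file bytes.
    """
    return sorted(
        location
        for location, data in current_contents.items()
        if content_hash(data) != recorded_hash
    )


# Illustrative data only: one untouched copy and two modified copies.
recorded = content_hash(b"original")
current = {
    ("106A", "/vm1/sys/file.dll"): b"original",
    ("106B", "/vm2/sys/file.dll"): b"tampered",
    ("106C", "/vm3/sys/file.dll"): b"tampered",
}
changed = detect_changes(recorded, current)
```

In this example `changed` contains the copies on 106B and 106C, which are the storage locations handed on to the selection and scanning steps.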
The management server 102 further comprises the storage location selection module 122 configured to select in the first list 110 one storage location for a given file amongst the storage locations 114A to 114N where a change has been detected for the given file. In an implementation, one or more storage servers 106A to 106N are running the same operating system at one or more storage locations 114A to 114N. Moreover, each storage server 106A to 106N has the given file, such as the file 104A (e.g., a system32 file). In other words, the given file may be located in each of the storage servers 106A to 106N. Therefore, if the change in the content information 118 related to the given file is detected by the change detection module 120, the storage location selection module 122 of the management server 102 is used to select one storage location from the first list 110 based on the detected change in the given file. For example, the storage location selection module 122 selects the storage location 114A on the storage server 106A as recorded by the first list 110, where the change detection module 120 detects the change in the given file. Beneficially, as compared to the conventional approach, the storage location selection module 122 is not limited to one storage server, such as the storage server 106A. Further, the first list 110 of the catalog 108 is used by the storage location selection module 122 to determine the exact location of the changed given file.
The management server 102 further comprises the scanning module 124 configured to determine whether a detected change corresponds to a malicious change by scanning a given file for which a change is detected, at a selected storage location of the given file. In an example, a virus has changed the content information 118 of the given file, such as the file 104A (e.g., a system32 file) on one or more of the storage servers 106A to 106N (or on several of the virtual machines). Moreover, the change is detected by the change detection module 120 in the first list 110. Thereafter, the scanning module 124 (or an antivirus) scans the changed given file at the selected storage location of the changed given file, to determine if the change detected by the change detection module 120 corresponds to a malicious change or not. For example, the scanning module 124 scans the changed file 104A at the storage location 114A, and determines if the change in the changed file 104A is a malicious change or not. Moreover, the next time the same changed given file shows up, the management server 102 does not need to schedule the scanning module 124 (or an antivirus scan), since the catalog 108 of the management server 102 already holds the content information 118 in its database indicating that such a file is an affected version of the given file. Beneficially, as compared to the conventional approach, the scanning module 124 performs effective scans (or antivirus scans) which do not scan a duplicate file more than once.
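The idea that a previously identified affected version need not be scanned again can be sketched in Python as follows. The function `is_malicious`, the set of known threat hashes and the stand-in scanner `fake_scan` are hypothetical; a real deployment would call an actual antivirus engine, which is not modelled here.

```python
def is_malicious(changed_hash, location, known_threats, scan):
    """Return (verdict, scanned) for a changed file.

    known_threats is a set of content hashes already identified as affected.
    scan is a callable taking a location and returning True for a malicious
    file; it is only invoked when the hash is not already known.
    """
    if changed_hash in known_threats:
        return True, False  # verdict reused, no scan performed
    verdict = bool(scan(location))
    if verdict:
        known_threats.add(changed_hash)  # remember the affected version
    return verdict, True


# Stand-in scanner that records its invocations (illustration only).
calls = []


def fake_scan(location):
    calls.append(location)
    return True


threats = set()
first_result = is_malicious("abc123", ("106A", "/vm1/sys/file.dll"), threats, fake_scan)
second_result = is_malicious("abc123", ("106B", "/vm9/sys/file.dll"), threats, fake_scan)
```

The first call runs the scanner and records the hash; the second call, for the same content at another location, returns the stored verdict without scanning.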
In an implementation, a user of the management server 102 specifies an extent of the scanning module 124 (e.g., scope of virtual machine/ entire storage servers 106A to 106N/ entire organization) on the storage servers 106A to 106N (or CPU resources) on which the scan can run. The user further requests load balancing (or an automatic load balancing) on the monitored servers, such as 20% load balancing on the storage server 106A, 40% load balancing on the storage server 106B, and 40% load balancing on the subsequent storage server. Thereafter, an initial map is built to scan the files stored at the storage servers 106A to 106N (or the files that have yet to be scanned). Moreover, a portion of the scan is scheduled to run on the monitored servers (or designated CPU resources), and upon completion of that portion of the scan, another portion of the scan is scheduled. As a result, each file is marked as scanned (e.g., with a version used for the scan). In an example, the initial map may change in real-time as a few of the storage servers 106A to 106N may be busier than others. For example, if the storage server 106A and the storage server 106B are busier as compared to subsequent servers, then the scanning process of the scanning module 124 will be slower on the storage server 106A and the storage server 106B as compared to subsequent storage servers. Moreover, when all the files 104A to 104N have been scanned, the user or a processor of the management server 102 is notified of the results of the scanning module 124. In an example, a patch is applied to the scanning module 124 (e.g., antivirus getting updated with virus/malware information and a rescan is needed).
Moreover, with an application program interface between the catalog 108 and the scanning module 124 (or an antivirus), the catalog 108 balances a file system scan between multiple file systems, such as the storage servers 106A to 106N, such that duplicate files in the files 104A to 104N are only scanned once.
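A simple way to picture a balanced scan in which each duplicate is scanned only once is the greedy plan sketched below in Python. The greedy rule (send each unique file to the holding server with the fewest scans assigned so far) is an assumption made for illustration; the description does not specify the balancing algorithm, and the percentage-based shares requested by the user are not modelled here.

```python
def plan_scans(first_list):
    """Pick one location per unique file, spreading scans across servers.

    first_list maps identifier -> list of (server, path) duplicates.
    Returns identifier -> the single (server, path) chosen for scanning.
    """
    assigned = {}
    load = {}  # server -> number of scans assigned so far
    for identifier in sorted(first_list):
        locations = sorted(first_list[identifier])
        # Prefer the holding server with the lowest assigned load;
        # ties are broken by location for a deterministic plan.
        choice = min(locations, key=lambda loc: (load.get(loc[0], 0), loc))
        assigned[identifier] = choice
        load[choice[0]] = load.get(choice[0], 0) + 1
    return assigned


# Illustrative data only: two files duplicated on both servers, one unique.
catalog = {
    "h1": [("106A", "/vm1/a"), ("106B", "/vm2/a")],
    "h2": [("106A", "/vm1/b"), ("106B", "/vm2/b")],
    "h3": [("106A", "/vm1/c")],
}
plan = plan_scans(catalog)
```

Each identifier appears once in `plan`, so no duplicate is scanned twice, and the two duplicated files are split between 106A and 106B.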
The management server 102 further comprises the marking module 126 to mark all the files 104A to 104N, stored at the locations 116A to 116N where a change has been detected for the given file, as valid if the detected change is not determined as being a malicious change, and as affected if the detected change is determined as being a malicious change. In other words, the marking module 126 is used to mark all the files 104A to 104N of the first list 110 based on the results of the scanning module 124. For example, if the scanning module 124 detects that the detected change in the file 104A stored at the location 116A is not a malicious change, then the marking module 126 marks the given file as valid (or a valid file). However, if the scanning module 124 detects that the detected change in the given file stored at the location 116A is a malicious change, then the marking module 126 marks the given file as affected (or an affected file). Beneficially, as compared to the conventional approach, the catalog 108 saves the scan version, and an informed decision on which files need to be scanned can be made. Moreover, the catalog 108 creates a resource-effective company-wide (or organization-wide) computer virus/malware protection framework. Further, the catalog 108 balances a file system scan between multiple file systems such that duplicate files are only scanned once.
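The marking step, in which the result of a single scan is applied to every location where the same change was detected, can be sketched in Python as follows. The function name, the status strings and the stand-in scanners are hypothetical and serve only to illustrate the flow.

```python
def scan_once_and_mark(changed_locations, selected, scan):
    """Scan only the selected location and mark every changed location.

    changed_locations is a list of (server, path) pairs where the same
    change was detected; scan returns True if the file is malicious.
    """
    if selected not in changed_locations:
        raise ValueError("selected location must be one of the changed locations")
    status = "affected" if scan(selected) else "valid"
    return {location: status for location in changed_locations}


# Illustrative data only.
changed_locations = [
    ("106A", "/vm1/sys/file.dll"),
    ("106B", "/vm2/sys/file.dll"),
    ("106C", "/vm3/sys/file.dll"),
]
selected = changed_locations[0]
benign_marks = scan_once_and_mark(changed_locations, selected, lambda loc: False)
malicious_marks = scan_once_and_mark(changed_locations, selected, lambda loc: True)
```

With a benign result every copy is marked valid, and with a malicious result every copy is marked affected, even though only the selected copy was scanned.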
In an implementation, one or more of the storage servers 106A to 106N run a number of operating systems at one or more of the storage locations 114A to 114N (or on different virtual machines). Moreover, the files 104A to 104N stored at one or more of the storage locations 114A to 114N are the same for each of the storage servers 106A to 106N, as the operating systems are also the same for each of the storage servers 106A to 106N. However, the content information 118 recorded by the first list 110 is different for each of the storage servers 106A to 106N. For example, the pictures directory is different for each of the storage servers 106A to 106N, as every user saves his/her pictures there, so the content information 118 is different for each storage server 106A to 106N. Therefore, in such a case, the content information 118 recorded by the first list 110 is scanned for each storage server 106A to 106N, so that the load of the scan does not affect a single storage system. Beneficially, as compared to the conventional approach, the compute (or computation) required for the scan is much less than the compute required in the conventional approach (i.e., if every conventional virtual machine scanned its own conventional files). In an example, if a few of the files 104A to 104N have already been scanned (e.g., pictures directory) on one of the storage servers (or virtual machines), and those files have not changed since the last scan, then such files do not need to be scanned again, since the catalog 108 tracks changes to files and records that there has not been a change since the last scan. As a result, the present disclosure is very resource efficient in comparison to the conventional approaches.
In accordance with an embodiment, the storage location selection module 122 is configured to select in the first list 110 a storage server, amongst the monitored servers corresponding to the storage locations 114A to 114N where the change has been detected for a given file, that is the least used one, and selecting in the first list 110 one of the locations 116A to 116N of the given file in the selected server. As the change detection module 120 detects a change in the given file at one or more storage locations 114A to 114N of the first list 110. Therefore, one or more storage locations 114A to 114N of the first list 110 are used by the storage location selection module 122 to select the least used storage server among the monitored servers, such as the storage server 106A, where the change has been detected by the change detection module 120 for the given file. Moreover, one or more storage locations 114A to 114N of the first list 110 are further used by the storage location selection module 122 to select one of the locations 116A to 116N of the given file in the selected server, where the change has been detected for the given file. For example, the storage location 114A of the first list 110 is used by the storage location selection module 122 to select the location 116A of the file 104A in the storage server 106A, where the change has been detected for the file 104A. In an implementation, the scanning module 124 is used by the catalog 108 to schedule an immediate scan of the changed file on the selected location of the least used storage server. In other words, the catalog 108 schedules an immediate antivirus scan on a machine that is not regularly used, and on a copy of the file which resides on the storage server which is the file system least used. For example, the catalog 108 schedules a virus scan using the scanning module 124 on a specific virtual machine, and on a specific location, (e.g. 
run scan on dev.ABC.not frequently used on respective directory such as C:\Windows\System32). Beneficially, in comparison to the conventional approach, such a scan is probably the most efficient scan possible, since it does not affect production or heavily used development environments, and does not cause an unnecessary load on heavily used network file systems, such as the storage servers other than the selected least used storage server.
In accordance with an embodiment, the storage location selection module 122 is configured to compare the number of reads and/or writes performed on the monitored servers corresponding to the storage locations 114A to 114N where the change has been detected for the given file, in a given period of time, and to select in the first list 110 one of the servers, corresponding to the storage locations 114A to 114N where the change has been detected for the given file, for which the number of reads and/or writes is the smallest. In an example, the change detection module 120 detects a change on the given file in a given period of time on the monitored servers. Thereafter, one or more storage locations 114A to 114N of the first list 110 are used by the storage location selection module 122 to identify the monitored servers, where the change has been detected by the change detection module 120 in a given time period for the given file. For example, the storage locations 114A and 114B of the first list 110 are used by the storage location selection module 122 to identify the storage server 106A and the storage server 106B, where the change has been detected. Thereafter, the storage location selection module 122 of the management server 102 compares the number of reads and/or writes performed in a given time period on the monitored servers. Moreover, the storage locations 114A to 114N are used by the storage location selection module 122 to select one of the monitored servers for which the number of reads and/or writes of the changed given file is the smallest, to identify how the change started and which storage server was initially affected. For example, if the number of reads and/or writes of the changed file 104A is small at the storage server 106A as compared to the storage server 106B, then the storage location selection module 122 selects the storage server 106A, which means that the file 104A was initially changed at the storage server 106A. 
In other words, the smallest number of reads and/or writes is used by the storage location selection module 122 to identify which virtual machine was initially affected, and the process by which the virus or malware spread. Further, file access times allow a spread timeline to be created for a post-mortem analysis of the spread of the virus or malware, which is used to identify flaws in network security. The spread timeline is also used to locate the first system to be affected (i.e., patient zero), which helps to find the method of initial infection, which is usually different from the method of spreading through the network.
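The comparison of reads and/or writes described above can be sketched as follows. This is an illustrative sketch only: summing reads and writes into a single load figure, and the names `ServerLoad` and `select_least_used`, are assumptions rather than details given in the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerLoad:
    server: str
    reads: int    # reads counted in the given period of time
    writes: int   # writes counted in the same period


def select_least_used(changed_servers, loads):
    """Pick, among servers where the change was detected, the least used one."""
    candidates = [load for load in loads if load.server in changed_servers]
    if not candidates:
        raise ValueError("no load data for the servers where the change was detected")
    return min(candidates, key=lambda load: load.reads + load.writes).server


loads = [
    ServerLoad("106A", reads=120, writes=30),
    ServerLoad("106B", reads=9000, writes=4100),
    ServerLoad("106C", reads=5, writes=1),  # idle, but the file did not change here
]
selected = select_least_used({"106A", "106B"}, loads)
```

Only servers where the change was detected are candidates, so the idle server 106C is ignored and 106A is selected.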
In accordance with an embodiment, the catalog 108 is further configured to store a second list 128, of files 104A to 104N stored in one or more monitored servers amongst the storage servers 106A to 106N and that have been affected by a malicious change. The second list recording, for each listed file, the identifier 112 for identifying the listed file, the one or more storage locations 114A to 114N for identifying the one or more monitored servers and the locations on the corresponding monitored servers, where the listed file is stored, the content information 118 related to the content of the listed file. In an implementation, the scanning module 124 determines that the detected change corresponds to the malicious change at one or more monitored servers. Thereafter, the catalog 108 stores a threat database, such as the second list 128 of the files 104A to 104N that have been affected by the malicious change, so as to manage the affected files. The second list 128 of the catalog 108 further records (or stores) the content information 118 related to the content of the listed file so as to manage the affected files 104A to 104N at one or more monitored servers. Further, one or more storage locations 114A to 114N recorded by the second list 128 are used by the catalog 108 to identify and track the content information 118 of the monitored servers, such as the storage server 106 A and the storage server 106B. The one or more storage locations 114A to 114N are further used by the catalog 108 to identify (and locate) the one or more locations 116A to 116N on the identified monitored servers, so as to determine the exact location of the affected files 104A to 104N. Thereafter, the catalog 108 uses the identifier 112 of the second list 128 (i.e., threat database) to verify the content information 118 and also to identify that all the listed files are duplicates of the files 104A to 104N, which are stored at one or more locations 116A to 116N of the monitored servers. 
In other words, the second list 128 of the catalog 108 records the identifier 112 to identify the duplicates for each listed file of the second list 128 from the files 104A to 104N stored at one or more storage servers 106A to 106N.
In accordance with an embodiment, the management server 102 further comprises a threat recording module 130 configured to record in the second list 128, when a detected change is determined as a malicious change, the identifier of the given file, the one or more storage locations 114A to 114N of the given file where the change has been detected, and the content information 118 related to the content of the changed given file. The second list 128 (or threat database) holds the same kind of information as the first list 110 (or duplicate database), but with a long retention and only for the files 104A to 104N positively identified in the past as threats. Therefore, the threat recording module 130 is used by the management server 102 to record this information into the second list 128 separately from the first list 110. The information may include the time of change, the identifier 112 (or a hash value), the content information 118, and one or more storage locations 114A to 114N of the given file where the change has been detected. In an example, the content information 118 recorded by the second list 128 of the catalog 108 is used by the management server 102 to track each affected file 104A to 104N at the one or more storage servers 106A to 106N. Moreover, the content information 118 recorded by the second list 128 is used to save information about previously affected (or infected) files 104A to 104N, thereby removing the need to perform an actual scan.
In other words, in this embodiment, if antivirus detects a virus on a system32 file it has scanned, then at this stage, the catalog 108 of the management server 102 already has the content information 118 of all the listed files that are affected. The catalog 108 further has information related to an original version and time of change of each affected file, a behavioral pattern leading to the change (e.g., file ~/Downloads/I_am_a_malware.pdf added three seconds before the change), and an identifier (or a hash) of the affected file. Moreover, based on the content information 118 of the all the listed files 104A to 104N that are affected, the management server 102 can either proactively “fix” all the affected files 104A to 104N if configured to do so or send a warning to a processor of the management server 102 with all the content information 118. Beneficially, in comparison with the conventional approach, such information is used by the storage location selection module 122 of the management server 102 to receive the information for the location (or virtual machine) of the storage server, which was initially affected, and a process of start of the change (or spread) (e.g. download and open of a specific type of file). In addition, next time if the catalog 108 identifies the affected file or a different file name with the same contents (e.g., ~/Downloads/I_am_a_malware.pdf), then the catalog 108 takes actions or warns that a virus may be present. Moreover, next time the affected files show up, it does not need to schedule an antivirus scan, since it already holds information in its database that this file is an affected version of the duplicate file of the first list 110.
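A minimal sketch of such a threat database lookup is given below. It assumes that the identifier is a SHA-256 hash of the file content, so that a renamed copy of a known affected file is recognised without a new scan; the class and method names are hypothetical.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class ThreatRecord:
    identifier: str                                 # hash of the affected content
    locations: list = field(default_factory=list)   # (server, path, time of change)


class ThreatDatabase:
    """Illustrative stand-in for the second list 128."""

    def __init__(self):
        self._records = {}

    def record(self, content: bytes, server: str, path: str, changed_at: float):
        ident = hashlib.sha256(content).hexdigest()
        rec = self._records.setdefault(ident, ThreatRecord(ident))
        rec.locations.append((server, path, changed_at))
        return rec

    def is_known_threat(self, content: bytes) -> bool:
        # The lookup depends only on content, not on the file name.
        return hashlib.sha256(content).hexdigest() in self._records


db = ThreatDatabase()
db.record(b"malicious-bytes", "106A", "~/Downloads/I_am_a_malware.pdf", 100.0)
renamed_hit = db.is_known_threat(b"malicious-bytes")   # same content, any name
benign_hit = db.is_known_threat(b"harmless-bytes")
```

Because the key is derived from the content, the same record is found whether the file reappears under its original name or under a different one.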
In accordance with an embodiment, the catalog 108 is further configured to record in the second list 128, for a given listed file, and for each of the storage location 114A to 114N of the given listed file, the time of a detected change determined as malicious. In other words, the catalog 108 records the time at which the malicious change is detected for a given listed file and also records the time at which the malicious change is detected at each of the storage locations 114A to 114N of the given listed file. Moreover, recording the time of the detected malicious change is used by the catalog 108 to identify the initially affected file as well as the initially affected storage location. In accordance with an embodiment, the management server 102 further comprises a sorting module 132 configured to sort, for a given file listed in the second list 128, the corresponding storage locations 114A to 114N by time of malicious change. In other words, the sorting module 132 is able to sort the storage location 114A to 114N of the given listed file by time of malicious change in the given listed file at each of the storage location 114A to 114N. Beneficially, as compared to the conventional approach, the management server 102 identifies the first time of change (i.e., patient 0) for the given listed file.
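The sorting by time of malicious change can be sketched as follows. The tuple layout, the function name `sort_by_malicious_change`, and the sample timestamps are assumptions made purely for illustration.

```python
from datetime import datetime


def sort_by_malicious_change(locations):
    """Return (server, path, changed_at) tuples ordered by time of change."""
    return sorted(locations, key=lambda loc: loc[2])


locations = [
    ("106B", "/os/lib.so", datetime(2021, 4, 27, 10, 5)),
    ("106A", "/os/lib.so", datetime(2021, 4, 27, 9, 58)),
    ("106C", "/os/lib.so", datetime(2021, 4, 27, 10, 20)),
]
timeline = sort_by_malicious_change(locations)
patient_zero = timeline[0]  # storage location with the earliest malicious change
```

The first element of the sorted timeline corresponds to the storage location that was affected first.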
In accordance with an embodiment, the catalog 108 is further configured to record in the first list 110, for each listed file, a protection mark for identifying the listed file as a protected or unprotected file, wherein the change detection module 120 is further configured to determine if the changed given file is marked as protected in the first list 110 and to trigger the storage location selection module 122 and the scanning module 124 only if the changed given file is marked as protected in the first list 110. In an example, the catalog 108 monitors changes to critical files, such as operating files that should not change (unless a system upgrade is performed). Therefore, for such files, the protection mark is recorded by the catalog 108 in the first list 110 for each listed file to identify the protected and unprotected files. Moreover, if the changed given file is marked as protected in the first list 110, then the change detection module 120 triggers the storage location selection module 122 and the scanning module 124, to protect the corresponding changed file. In an example, if the file is not supposed to change, the catalog 108 starts actions to verify if this change is legitimate or if it is malicious.
In other words, in this embodiment, for protected files, the only action required is marking files that are not prone to change, such as system files. This can be done explicitly in the catalog 108 code, and built over time using an artificial intelligence system that learns files 104A to 104N that are affected by the different computer virus/malware. In an example, the following steps are used by the catalog 108 for the protection triggered by the change of file as a normal update of the catalog 108:
1. Check if the listed file is a file marked for protection.
2. If the file is marked for protection, check if it is in the threat database (or the second list 128).
2.1. If it is in the threat database, go to step 6 - the infected system is “patient 0”.
3. Schedule a virus scan (or execute the scanning module 124) to run on the copy of the file on the file system receiving the least load.
4. Get results from the antivirus:
4.1. Not infected - mark the change as valid on all copies of the file and finish.
4.2. Infected - mark all copies as infected, and add the file information to the threat database.
5. Sort all the duplicate infected files by time of change and identify the first to change (patient 0).
6. Identify all changes to “patient 0” prior to the actual infection.
7. Notify the system administrator of all the infection information, including “patient 0”.
8. If the system is configured to proactively fix the system, affected files can be restored to their previous values, and any additional system clean-ups can be done automatically (e.g., some viruses can be disabled by creating a file with a certain name, or possibly some additional files may need to be removed).
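The numbered steps above can be sketched as a single function. This is a simplified illustration under stated assumptions: the `scan` callable stands in for the antivirus, the threat database is a plain set of content hashes, and a file already in the threat database is still marked and sorted rather than jumping straight to step 6. Steps 6 to 8 (history analysis, notification, and repair) are omitted. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Copy:
    server: str
    path: str
    changed_at: float    # time of the detected change
    load: int            # recent reads plus writes on the hosting file system
    status: str = "unknown"


def handle_change(is_protected, digest, copies, threat_db, scan):
    """Return (verdict, patient_zero) for a changed file and its copies."""
    # Step 1: only files marked for protection trigger the workflow.
    if not is_protected:
        return "ignored", None
    # Step 2: content already in the threat database needs no new scan.
    if digest in threat_db:
        infected = True
    else:
        # Step 3: scan the copy on the file system receiving the least load.
        target = min(copies, key=lambda c: c.load)
        # Step 4: get the result from the antivirus (assumed callable).
        infected = scan(target)
    if not infected:
        # Step 4.1: mark the change as valid on all copies and finish.
        for c in copies:
            c.status = "valid"
        return "valid", None
    # Step 4.2: mark all copies as infected and record the threat.
    for c in copies:
        c.status = "infected"
    threat_db.add(digest)
    # Step 5: the copy that changed first is "patient 0".
    return "infected", min(copies, key=lambda c: c.changed_at)


copies = [
    Copy("106A", "/os/lib.so", changed_at=30.0, load=15),
    Copy("106B", "/os/lib.so", changed_at=10.0, load=9000),
]
threat_db = set()
scanned = []


def fake_scan(copy):
    scanned.append(copy.server)  # remember which copy was scanned
    return True                  # pretend the antivirus reports an infection


verdict, patient_zero = handle_change(True, "abc123", copies, threat_db, fake_scan)
```

In this example the scan runs on the lightly loaded server 106A, while the earliest change, and hence "patient 0", is on server 106B. A second call with the same hash would skip the scan because the hash is already in the threat database.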
Beneficially, as compared to the conventional approach, the protection mark is used by the catalog 108 of the management server 102 to perform effective protection of the organization resources, such as the storage servers 106A to 106N. The protection mark is also used to monitor changes in critical files (e.g., operating files) that should not change unless a system upgrade is performed.
Therefore, the present disclosure provides a management server 102, which uses the catalog 108 for resource-effective protection from computer viruses or malware. As the catalog 108 includes the first list 110, which records different information related to the files 104A to 104N, the management server 102 is notified of changes on the storage servers 106A to 106N very close to the actual change. Moreover, the management server 102 can distinguish between the malicious change and the normal change. The management server 102 further identifies the behavioral patterns leading to the installation of a computer virus/malware and avoids/warns of these patterns before actual infection. Therefore, the management server 102 performs effective antivirus scans using the scanning module 124, which does not scan a duplicate file more than once. In other words, the management server 102 detects the computer virus/malware infection without actually performing an antivirus scan. Moreover, the management server 102 balances the negative effects (which are present in the conventional approach) of antivirus scans using the scanning module 124 on the storage servers 106A to 106N.
FIG. 1B is a block diagram that illustrates various exemplary components of the management server, in accordance with an embodiment of the present disclosure. With reference to FIG. 1B, there is shown a block diagram 100B that illustrates the management server 102. The management server 102 includes a first processor 134, a first transceiver 136, and a first memory 138. The first processor 134 may be communicatively coupled to the first transceiver 136 and the first memory 138. The management server 102 is coupled to the storage server 106A via the communication network 140. There is further shown the change detection module 120, the storage location selection module 122, the scanning module 124, and the marking module 126. 
Optionally, in some implementation, the threat recording module 130 and the sorting module 132 may also be provided. All such modules may be communicatively coupled to each other and to the first processor 134 of the management server 102.
The first processor 134 is configured to manage the files 104A to 104N stored at the storage servers 106A to 106N. The first processor 134 may perform all the functionalities of the management server 102. In an example, the first processor 134 may be a general-purpose processor. Other examples of the first processor 134 may include, but are not limited to, a microprocessor, a microcontroller, a complex instruction set computing (CISC) processor, an application-specific integrated circuit (ASIC) processor, a reduced instruction set (RISC) processor, a very long instruction word (VLIW) processor, a central processing unit (CPU), a state machine, a data processing unit, and other processors or control circuitry. Moreover, the first processor 134 may refer to one or more individual processors, processing devices, or a processing unit that is part of a machine, such as the management server 102.
The first transceiver 136 includes suitable logic, circuitry, and interfaces that may be configured to communicate with one or more external devices, such as the server 106A. Examples of the first transceiver 136 may include, but is not limited to, an antenna, a telematics unit, a radio frequency (RF) transceiver, one or more amplifiers, one or more oscillators, a digital signal processor, a coder-decoder (CODEC) chipset, and/or a subscriber identity module (SIM) card.
The first memory 138 refers to a primary storage system of the management server 102. The first memory 138 includes suitable logic, circuitry, and interfaces that may be configured to store the catalog 108. Examples of implementation of the first memory 138 may include, but are not limited to, Electrically Erasable Programmable Read-Only Memory (EEPROM), Random Access Memory (RAM), Read Only Memory (ROM), Hard Disk Drive (HDD), Flash memory, Solid-State Drive (SSD), and/or CPU cache memory. The first memory 138 may store an operating system and/or other program products (including one or more operation algorithms) to operate the management server 102.
In operation, the first processor 134 of the management server 102 is configured for managing the storage of files 104A to 104N in one or more storage servers 106A to 106N. The first processor 134 is further configured to store, into the catalog 108, the first list 110 of the files 104A to 104N stored in one or more monitored servers amongst the storage servers 106A to 106N.
FIG. 1C is a block diagram that illustrates various exemplary components of the storage server, in accordance with an embodiment of the present disclosure. With reference to FIG. 1C, there is shown a block diagram 100C of the storage server 106A amongst the one or more storage servers 106A to 106N. The storage server 106A further includes a second processor 142, a second transceiver 144, and a second memory 146. The second processor 142 may be communicatively coupled to the second transceiver 144 and the second memory 146. The storage server 106A is communicatively coupled to the management server 102 via the communication network 140.
The second processor 142 is configured to execute the files 104A to 104N stored in the second memory 146. In an example, the second processor 142 may be a general-purpose processor. Other examples of the second processor 142 may include, but are not limited to, a microprocessor, a microcontroller, a complex instruction set computing (CISC) processor, an application-specific integrated circuit (ASIC) processor, a reduced instruction set (RISC) processor, a very long instruction word (VLIW) processor, a central processing unit (CPU), a state machine, a data processing unit, and other processors or control circuitry. The second processor 142 of the storage server 106A is configured to store the files 104A to 104N in the second memory 146 of the storage server 106A.
The second transceiver 144 includes suitable logic, circuitry, and interfaces that may be configured to communicate with one or more external devices, such as the management server 102. Examples of the second transceiver 144 may include, but are not limited to, an antenna, a telematics unit, a radio frequency (RF) transceiver, one or more amplifiers, one or more oscillators, a digital signal processor, a coder-decoder (CODEC) chipset, and/or a subscriber identity module (SIM) card.
The second memory 146 refers to a primary storage system of the storage server 106A. Examples of implementation of the second memory 146 may include, but are not limited to, Electrically Erasable Programmable Read-Only Memory (EEPROM), Random Access Memory (RAM), Read Only Memory (ROM), Hard Disk Drive (HDD), Flash memory, Solid-State Drive (SSD), and/or CPU cache memory. The second memory 146 may store an operating system and/or other program products (including one or more operation algorithms) to operate the storage server 106A.
FIG. 2 is a flowchart of a method of file storage management in a management server, in accordance with an embodiment of the present disclosure. FIG. 2 is described in conjunction with elements from FIG. 1A. With reference to FIG. 2, there is shown a method 200. The method 200 includes steps 202, 204, 206, 208, and 210.
In another aspect the present disclosure provides a method 200 of file storage management in a management server 102, said management server 102 comprising one or more storage servers 106A to 106N configured to store files, the method comprising: configuring a catalog 108 to store a first list 110, of files 104A to 104N stored in one or more monitored servers amongst the storage servers 106A to 106N, the first list 110 recording, for each listed file, an identifier 112 for identifying the listed file, one or more storage locations 114A to 114N for identifying the one or more monitored servers and the one or more locations 116A to 116N on the identified monitored servers, where the listed file is stored, and a content information 118 related to the content of the listed file, detecting a change having occurred on a given file at one or more of the storage locations 114A to 114N recorded in the first list 110 for the given file, by determining the content information 118 related to the content of the changed given file and comparing it with the content information 118 recorded in the first list 110 for the given file, selecting in the first list 110 one storage location for the given file amongst one or more of the storage locations 114A to 114N where the change has been detected for the given file, determining whether the detected change corresponds to a malicious change by scanning the given file at the selected storage location, marking all the files 104A to 104N, stored at the storage locations 114A to 114N where the change has been detected for the given file, as valid if the detected change is not determined as being a malicious change, and as affected if the detected change is determined as being a malicious change.
The present disclosure provides the method 200 of file storage management in a management server 102, the management server 102 comprising one or more storage servers 106A to 106N configured to store files. In an example, the components of the management server 102 can be spread across different machines at different locations. The management server 102 is used for managing the files 104A to 104N stored at the one or more storage servers 106A to 106N.
At step 202, the method 200 comprises configuring the catalog 108 to store the first list 110 of files 104A to 104N stored in one or more monitored servers amongst the storage servers 106A to 106N. The first list 110 records, for each listed file, the identifier 112 for identifying the listed file, and one or more storage locations 114A to 114N for identifying the one or more monitored servers and the one or more locations 116A to 116N on the identified monitored servers, where the listed file is stored. The first list 110 further records the content information 118 related to the content of the listed file. In other words, the method 200 comprises storing a duplicate file database, such as the first list 110 of the files 104A to 104N, by the catalog 108 of the management server 102, for managing the storage of the files 104A to 104N in one or more monitored servers. For example, the catalog 108 of the management server 102 manages the storage of the files 104A to 104N at the storage server 106A and the storage server 106B. Moreover, for each listed file, the first list 110 of the catalog 108 is configured to record (or store) the content information 118 related to the content of the listed file for managing each listed file 104A to 104N. In an example, the content information 118 recorded by the first list 110 of the catalog 108 is used by the management server 102 for tracking each listed file 104A to 104N of the one or more storage servers 106A to 106N. Thereafter, one or more storage locations 114A to 114N recorded by the first list 110 of the catalog 108 are used by the catalog 108 for identifying and tracking the monitored servers, such as the storage server 106A and the storage server 106B. 
The one or more storage locations 114A to 114N recorded by the first list 110 are further used by the catalog 108 for identifying (and locating) the one or more locations 116A to 116N on the identified monitored servers, for determining the exact location of the listed file. For example, the storage location 114A recorded by the first list 110 is used by the catalog 108 for identifying the storage server 106A, and the location 116A is used for identifying (i.e., locating) the file 104A on the identified storage server 106A. Similarly, the storage location 114B recorded by the first list 110 is used by the catalog 108 for identifying the subsequent storage server, such as the storage server 106B and the location 116B of the subsequent files. Thereafter, the catalog 108 comprises using the identifier 112 of the first list 110 (i.e., duplicate file database) for verifying the content information 118 and also for identifying that all the listed files are duplicates of the files 104A to 104N, which are stored at one or more location 116A to 116N. In other words, the first list 110 of the catalog 108 records the identifier 112 for identifying the duplicates for each listed file of the first list 110 from the files 104A to 104N, which are stored at one or more monitored servers. In an example, the identifier 112A recorded on the first list 110 is used for identifying that all the listed files are duplicates of the files 104A to 104N, which are stored at one or more monitored servers. In another example, the identifier 112A recorded on the first list 110 is used for identifying that few of the listed files are duplicates of the files 104A to 104N, while the identifier 112B is recorded on the first list 110 for identifying that all the subsequent files are duplicates of the files 104A to 104N.
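One possible in-memory shape for the first list described at step 202 is sketched below. The disclosure does not prescribe a data layout, so the classes `StorageLocation`, `ListedFile`, and `FirstList`, and the choice of a string hash as identifier, are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class StorageLocation:
    server: str   # identifies the monitored server, e.g. "106A"
    path: str     # location of the file on that server


@dataclass
class ListedFile:
    identifier: str      # assumed: a hash shared by all duplicates of the file
    content_info: str    # assumed: information recorded about the content
    locations: list = field(default_factory=list)


class FirstList:
    """Illustrative duplicate file database keyed by identifier."""

    def __init__(self):
        self._entries = {}

    def add(self, identifier, content_info, server, path):
        entry = self._entries.setdefault(
            identifier, ListedFile(identifier, content_info)
        )
        entry.locations.append(StorageLocation(server, path))
        return entry

    def duplicates(self, identifier):
        """All storage locations holding a copy of the identified file."""
        return list(self._entries[identifier].locations)


first_list = FirstList()
first_list.add("hash-1", "size=42", "106A", "/os/lib.so")
first_list.add("hash-1", "size=42", "106B", "/os/lib.so")
first_list.add("hash-2", "size=7", "106A", "/home/u1/notes.txt")
copies = first_list.duplicates("hash-1")
```

Keying the entries by identifier means that every duplicate of a file, on any monitored server, is found with a single lookup.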
At step 204, the method 200 further comprises detecting a change having occurred on a given file at one or more of the storage locations 114A to 114N recorded in the first list 110 for the given file, by determining the content information 118 related to the content of the changed given file and comparing it with the content information 118 recorded in the first list 110 for the given file. The first list 110 records the content information 118 of the given file, such as the file 104A, which is the same either within one storage location, such as the storage location 114A, or between one or more of the storage locations 114A to 114N. Therefore, the method 200 comprises comparing the content of the changed given file 104A with the content information 118 recorded by the first list 110 to detect the change on the file 104A, such as by using the change detection module 120 of FIG. 1A. In an example, the catalog 108 (or global unstructured data catalog) is notified of changes on the given file (or a file system) very close to the actual change. In an implementation, if the content information 118 related to the given file has not changed since the last comparison, then the change detection module 120 need not compare the content information 118 at the subsequent storage location of the subsequent storage server. This is because the first list 110 of the catalog 108 tracks changes to the given file and records that there has not been a change since the last comparison. Beneficially, as compared to the conventional approach, the method 200 improves the overall performance of the management server 102 and utilizes the resources efficiently.
At step 206, the method 200 further comprises selecting in the first list 110 one storage location for the given file amongst the storage locations 114A to 114N where the change has been detected for the given file. 
In an implementation, the method 200 comprises running the same operating system at one or more storage locations 114A to 114N of the one or more storage servers 106A to 106N. Moreover, each storage server 106A to 106N has the given file, such as the file 104A (e.g., a system32 file). In other words, the given file may be located in each of the storage servers 106A to 106N. Therefore, if a change in the content information 118 related to the given file is detected by the change detection module 120, the management server 102 uses the storage location selection module 122 for selecting one storage location from the first list 110 based on the detected change in the given file. For example, the storage location selection module 122 selects the storage location 114A on the storage server 106A as recorded by the first list 110, where the change detection module 120 detects the change in the given file. Beneficially, as compared to the conventional approach, the storage location selection module 122 is not limited to one storage server, such as the storage server 106A. Further, the storage location selection module 122 uses the first list 110 of the catalog 108 for determining the exact location of the changed given file.
At step 208, the method 200 further comprises, determining whether the detected change corresponds to a malicious change by scanning the given file at the selected storage location. In an example, the method 200 comprises, changing the content information 118 of the given file, such as the file 104A (e.g., a system32 file) by a virus, on one or more of the storage server 106A to 106N (or on several of the virtual machines). Moreover, the method 200 further comprises, detecting the change in the first list 110, such as by the change detection module 120. Thereafter, the scanning module 124 (or an antivirus) comprises scanning the changed given file at the selected storage location of the changed given file for determining, if the change detected by the change detection module 120 corresponds to a malicious change or not. For example, the scanning module 124 comprises scanning the changed file 104A at the storage location 114A, and determining if the change in the changed file 104A is a malicious change or not. Moreover, next time if any changed given file shows up, the management server 102 does not need to schedule the scanning module 124 (or an antivirus scan), since the catalog 108 of the management server 102 holds the content information 118 in its database that such file is an affected version of given file. Beneficially, as compared to the conventional approach, the scanning module 124 performs the effective scans (or antivirus scans) which do not scan a duplicate file more than once. At step 210, the method 200 further comprises, marking all the files 104A to 104N, stored at the storage locations 114A to 114N where the change has been detected for the given file, as valid if the detected change is not determined as being a malicious change, and as affected if the detected change is determined as being a malicious change. 
In other words, the method 200 comprises marking the files 104A to 104N of the first list 110 based on the results of the scanning module 124, such as by using the marking module 126. For example, if the scanning module 124 detects that the detected change in the file 104A stored at the location 116A is not a malicious change, then the marking module 126 marks the given file as valid (or a valid file). However, if the scanning module 124 detects that the detected change in the given file stored at the location 116A is a malicious change, then the marking module 126 marks the given file as affected (or an affected file). Beneficially, as compared to the conventional approach, the catalog 108 saves the scan version, and an informed decision on which files need to be scanned can be made. Moreover, the catalog 108 creates a resource-effective organization-wide computer virus/malware protection framework and balances the storage server (or file system) scan between multiple storage servers 106A to 106N (or file systems), such that duplicate files 104A to 104N are only scanned once.
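Steps 204 and 210 can be sketched together as follows. The sketch assumes that the content information is a SHA-256 digest and that every changed copy carries the same new content, which the code does not verify; the function names and sample data are hypothetical.

```python
import hashlib


def content_info(data: bytes) -> str:
    """Content information, assumed here to be a SHA-256 hex digest."""
    return hashlib.sha256(data).hexdigest()


def detect_changed_locations(recorded_info, current_contents):
    """Step 204: locations whose current digest differs from the recorded one."""
    return [
        location
        for location, data in current_contents.items()
        if content_info(data) != recorded_info
    ]


def mark_locations(changed, is_malicious):
    """Step 210: apply the single scan verdict to every changed location."""
    label = "affected" if is_malicious else "valid"
    return {location: label for location in changed}


recorded = content_info(b"original")
current = {
    ("106A", "/os/lib.so"): b"patched",
    ("106B", "/os/lib.so"): b"patched",
    ("106C", "/os/lib.so"): b"original",   # unchanged copy
}
changed = detect_changed_locations(recorded, current)
marks = mark_locations(changed, is_malicious=False)
```

A single scan result, obtained at the selected storage location in step 208, is thus propagated to all locations where the change was detected, while the unchanged copy on 106C is left untouched.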
In accordance with an embodiment, in the method 200, selecting in the first list 110 one storage location comprises selecting in the first list 110 a storage server, amongst the monitored servers corresponding to the storage locations 114A to 114N where the change has been detected for the given file, that is the least used one, and selecting in the first list 110 one of the locations of the given file in the selected storage server. The change detection module 120 detects the change in the given file at one or more storage locations 114A to 114N of the first list 110. Therefore, the method 200 comprises using the one or more storage locations 114A to 114N of the first list 110 for selecting the least used storage server among the monitored servers where the change has been detected by the change detection module 120 for the given file, such as by the storage location selection module 122. Moreover, the one or more storage locations 114A to 114N of the first list 110 are further used by the storage location selection module 122 for selecting one of the locations 116A to 116N of the given file in the selected server, where the change has been detected for the given file. In an implementation, the catalog 108 uses the scanning module 124 for scheduling an immediate scan of the changed file on the selected location of the least used storage server. In other words, the catalog 108 schedules an immediate antivirus scan on a machine that is not regularly used, and on a copy of the file that resides on the least used storage server (or file system). Beneficially, in comparison to the conventional approach, such scanning is the most efficient scanning, since it does not affect production or heavily used development environments, and does not cause an unnecessary load on heavily used network file systems, such as the storage servers other than the selected least used storage server.
In accordance with an embodiment, the method 200 further comprises selecting in the first list 110 a storage server comprising comparing the number of reads and/or writes performed on the monitored servers corresponding to the storage locations 114A to 114N where the change has been detected for the given file, in a given period of time, and selecting in the first list 110 one of the storage servers 106 A to 106N, corresponding to the storage locations 114A to 114N where the change has been detected for the given file, for which the number of reads and/or writes is the smallest. In an example, the method 200 comprises, detecting a change on the given file in a given period of time on the monitored servers, such as using the change detection module 120. Thereafter, the storage location selection module 122 uses one or more storage locations 114A to 114N of the first list 110 for identifying the monitored servers, where the change has been detected by the change detection module 120 in a given period of time for the given file. Thereafter, the storage location selection module 122 of the management server 102 comprises comparing the number of reads and/or writes performed in a given period of time on the monitored servers. Moreover, the storage location selection module 122 comprises using the storage locations 114A to 114N for selecting one of the monitored servers for which the number of reads and/or writes of the changed given file is the smallest, for identifying how the change started and which storage server was initially affected.
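The selection of the least used storage server by comparing reads and/or writes may be sketched as follows. This is an illustrative Python sketch under stated assumptions: the function name select_scan_location, the (server, path) location shape, and the io_counts mapping of each server to its (reads, writes) in the given period are all hypothetical. The sketch sums reads and writes; the disclosure equally allows comparing only reads or only writes.

```python
from typing import Dict, List, Tuple

# Assumed shape of a storage location: (storage server, path on that server).
Location = Tuple[str, str]


def select_scan_location(
    changed: List[Location],
    io_counts: Dict[str, Tuple[int, int]],
) -> Location:
    """Pick one changed location on the server with the fewest reads plus writes.

    io_counts maps a server name to (reads, writes) observed in the given period.
    Only servers where the change was detected are candidates.
    """
    if not changed:
        raise ValueError("no changed locations to select from")
    # Preserve first-seen order so that ties are resolved deterministically.
    servers = list(dict.fromkeys(server for server, _ in changed))
    least_used = min(servers, key=lambda server: sum(io_counts[server]))
    # Any location of the file on the selected server may be scanned; take the first.
    return next(loc for loc in changed if loc[0] == least_used)
```

A server that is idle but holds no changed copy of the file is never selected, since the candidates are restricted to the servers where the change has been detected.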
In accordance with an embodiment, in the method 200, configuring the catalog 108 further comprises storing a second list 128 of files stored in one or more monitored servers amongst the storage servers that have been affected by a malicious change, the second list 128 recording, for each listed file, the identifier 112 for identifying the listed file, the one or more storage locations 114A to 114N for identifying the one or more monitored servers and the locations 116A to 116N on the corresponding monitored servers where the listed file is stored, and the content information 118 related to the content of the listed file. In an implementation, the method 200 comprises determining that the detected change corresponds to the malicious change at one or more monitored servers, such as by using the scanning module 124 of FIG. 1A. Thereafter, the catalog 108 stores a threat database, such as the second list 128 of the files 104A to 104N that have been affected by the malicious change, for managing the affected files. The second list 128 of the catalog 108 further records (or stores) the content information 118 related to the content of the listed file for managing the affected files 104A to 104N at one or more monitored servers. Further, the catalog 108 uses the one or more storage locations 114A to 114N recorded by the second list 128 for identifying and tracking the content information 118 of the monitored servers, such as the storage server 106A and the storage server 106B. The catalog 108 further uses the one or more storage locations 114A to 114N for identifying (and locating) the one or more locations 116A to 116N on the identified monitored servers, such as for determining the exact location of the affected files 104A to 104N.
Thereafter, the catalog 108 uses the identifier 112 of the second list 128 (i.e., threat database) for verifying the content information 118 and also for verifying that all the listed files are duplicates of the files 104A to 104N, which are stored at one or more locations 116A to 116N of the monitored servers.
In accordance with an embodiment, the method 200 further comprises, when determining that the detected change corresponds to a malicious change, recording in the second list 128 the identifier 112 of the given file, the one or more storage locations 114A to 114N of the given file where the change has been detected, and the content information related to the content of the changed given file. The second list 128 (or threat database) holds the same information as the first list 110 (or duplicate database), but with long retention and only for the files 104A to 104N positively identified in the past as threats. Therefore, the method 200 comprises recording the information related to such files 104A to 104N into the second list 128, such as by using the threat recording module 130. In an example, the management server 102 uses the content information 118 recorded by the second list 128 of the catalog 108 for tracking each affected file 104A to 104N at the one or more storage servers 106A to 106N. Moreover, the content information 118 recorded by the second list 128 is used to save information about previously affected (or infected) files 104A to 104N, thereby removing the need to perform an actual scan when such a file is encountered again.
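A threat database of this kind may be sketched in Python as below. The class names ThreatRecord and ThreatList, the method names record and is_known_threat, and the use of an in-memory dictionary keyed by identifier are assumptions made for illustration only; the disclosure does not prescribe a storage format.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Assumed shape of a storage location: (storage server, path on that server).
Location = Tuple[str, str]


@dataclass
class ThreatRecord:
    """Hypothetical second-list entry for a file determined as maliciously changed."""
    identifier: str
    content_info: str
    locations: List[Location]


class ThreatList:
    """Hypothetical in-memory second list, keyed by file identifier."""

    def __init__(self) -> None:
        self._records: Dict[str, ThreatRecord] = {}

    def record(self, identifier: str, content_info: str,
               changed_locations: List[Location]) -> ThreatRecord:
        """Record the identifier, content information and changed locations."""
        rec = self._records.setdefault(
            identifier, ThreatRecord(identifier, content_info, [])
        )
        for location in changed_locations:
            if location not in rec.locations:
                rec.locations.append(location)
        return rec

    def is_known_threat(self, identifier: str) -> bool:
        """True if this identifier was previously determined to be malicious."""
        return identifier in self._records
```

With such a structure, a changed file whose identifier is already present can be marked as affected by a lookup alone, which reflects the statement that a further actual scan is not needed.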
In accordance with an embodiment, the method 200 further comprises recording in the second list 128, for a given listed file, and for each of the storage locations 114A to 114N of the given listed file, the time of the detected change determined as malicious. In other words, the method 200 comprises recording the time at which the malicious change is detected for a given listed file in the catalog 108, and also recording the time at which the malicious change is detected at each of the storage locations 114A to 114N of the given listed file. Therefore, the method 200 comprises identifying the initially affected file as well as the initially affected storage location, by using the recorded time of the detected malicious change.
In accordance with an embodiment, the method 200 further comprises, for a given file listed in the second list 128, sorting the corresponding storage locations 114A to 114N by time of malicious change. In other words, the method 200 comprises sorting the storage locations 114A to 114N of the given listed file by the time of the malicious change in the given listed file at each of the storage locations 114A to 114N, such as by using the sorting module 132. Beneficially, as compared to the conventional approach, the management server 102 is thereby able to identify the first time of change, and hence the first affected storage location (i.e., patient 0), for the given listed file.
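The sorting by time of malicious change may be illustrated with the following short Python sketch. The function names and the mapping from a (server, path) location to a datetime are assumptions introduced for illustration.

```python
from datetime import datetime
from typing import Dict, List, Tuple

# Assumed shape of a storage location: (storage server, path on that server).
Location = Tuple[str, str]


def sort_locations_by_change_time(
    change_times: Dict[Location, datetime],
) -> List[Location]:
    """Return the storage locations ordered from earliest to latest malicious change."""
    return sorted(change_times, key=lambda location: change_times[location])


def first_affected_location(change_times: Dict[Location, datetime]) -> Location:
    """Return the location with the earliest recorded malicious change ("patient 0")."""
    if not change_times:
        raise ValueError("no recorded change times")
    return sort_locations_by_change_time(change_times)[0]
```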
In accordance with an embodiment, in the method 200, the first list 110 further comprises, for each listed file, a protection mark for identifying the listed file as a protected or unprotected file, wherein the method 200 further comprises, after detecting a change, verifying whether the changed given file is marked as protected in the first list 110, and wherein selecting in the first list 110 one storage location and determining whether the detected change corresponds to a malicious change is performed only if the changed given file is marked as protected in the first list 110. In an example, the method 200 comprises monitoring, by the catalog 108, changes to critical files, such as operating files that should not change (unless a system upgrade is performed). Therefore, the protection mark is recorded by the catalog 108 in the first list 110 for each listed file, for distinguishing the protected files from the unprotected files. Moreover, if the changed given file is marked as protected in the first list 110, then the change detection module 120 triggers the storage location selection module 122 and the scanning module 124 for protecting the corresponding changed file.
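The gating on the protection mark may be sketched as follows. The function name handle_detected_change, the protection_marks mapping and the select_and_scan callback are hypothetical; treating a file with no recorded mark as unprotected is likewise an assumption of this sketch, not something the disclosure states.

```python
from typing import Callable, Dict, Optional


def handle_detected_change(
    identifier: str,
    protection_marks: Dict[str, bool],
    select_and_scan: Callable[[str], str],
) -> Optional[str]:
    """Trigger location selection and scanning only for files marked as protected.

    Returns the scan verdict for a protected file, or None when the changed
    file is unprotected and therefore no scan is triggered.
    """
    # Assumption: a file with no recorded mark is treated as unprotected.
    if not protection_marks.get(identifier, False):
        return None
    return select_and_scan(identifier)
```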
Therefore, the present disclosure provides the method 200, which uses the catalog 108 for resource-effective protection from computer viruses or malware. Because the catalog 108 includes the first list 110, which records different information related to the files 104A to 104N, the management server 102 is notified of changes on the storage servers 106A to 106N very close to the actual change. Moreover, the method 200 is able to distinguish between a malicious change and a normal change. The method 200 further identifies the behavioral patterns leading to the installation of a computer virus/malware, and avoids or warns of these patterns before actual infection. Therefore, the method 200 performs effective antivirus scans using the scanning module 124, which does not scan a duplicate file more than once. In other words, the method 200 can detect a computer virus/malware infection in a duplicate of an already identified file without actually performing a further antivirus scan. Moreover, the method 200 balances the negative effects (which are present in the conventional approach) of antivirus scans using the scanning module 124 on the storage servers 106A to 106N.
FIG. 3A is an illustration that depicts a catalog cooperation mechanism, in accordance with an embodiment of the present disclosure. FIG. 3A is described in conjunction with elements from FIG. 1A. With reference to FIG. 3A, there is shown an illustration 300A that depicts the catalog 108, one or more network attached storage systems, such as a network attached storage (NAS) system 302, a network attached storage system 304 and a network attached storage system 306, a duplicate database 308, a system32 file 310, and a hash value 312.
The one or more network attached storage systems 302, 304, and 306 correspond to one or more storage servers, such as the storage servers 106A to 106N of FIG. 1A. Each of the network attached storage systems 302, 304, and 306 includes suitable logic, circuitry, interfaces and/or code that is configured to store and manage the system32 file 310.
The duplicate database 308 corresponds to a first list, such as the first list 110 of FIG. 1A. The duplicate database 308 holds a list of the duplicate system32 files 310. The duplicate database 308 holds information about the hash value 312, change information, file attributes (e.g., system file), and the like. In an implementation, the duplicate database 308 stores information on the system32 files 310 that are the same either within a network attached storage system, such as the network attached storage system 302, or between one or more network attached storage systems 302, 304, and 306.
The system32 files 310 correspond to files such as the files 104A to 104N. The system32 files 310 are executable files that are stored at one or more network attached storage systems 302, 304, and 306. In an example, each of the system32 files 310 may be associated with a virtual machine. The system32 files 310 are used here as an example, but any other files (typically critical files) may be used.
The hash value 312 corresponds to an identifier, such as the identifier 112 of FIG. 1A. The hash value 312 is used to identify the listed file from the duplicate database 308. There is further shown content information represented as ‘XXXX’ in FIG. 3A. The representation ‘XXXX’ is only used for exemplary purposes. As shown in FIG. 3A, there are multiple network attached storage systems 302, 304, and 306 that are running an operating system in an organization, and each network attached storage system 302, 304, and 306 has a system file named system32 file 310. Moreover, the catalog 108 tracks changes in the network attached storage systems 302, 304, and 306. In addition, for each listed system32 file 310, the duplicate database 308 also records the hash value 312 and the content information, as shown by "XXXX" in FIG. 3A. The hash value 312 and the duplicate database 308 are used by the catalog 108 to identify that all of the system32 files 310 of the duplicate database 308 are duplicates of the system32 files 310, which are stored at one or more of the network attached storage systems 302, 304, and 306. For example, the hash value 312 is used by the catalog 108 to identify the network attached storage systems 302, 304, and 306, and also to identify the location of the system32 file 310 on the identified network attached storage systems 302, 304, and 306, as shown by the arrows in FIG. 3A. Therefore, the hash value 312 and the duplicate database 308 are used by the catalog 108 to manage the system32 files 310 of the network attached storage systems 302, 304, and 306.
FIG. 3B is an illustration that depicts a catalog cooperation mechanism, in accordance with another embodiment of the present disclosure. FIG. 3B is described in conjunction with elements from FIGs. 1 A and 3 A. With reference to FIG. 3B, there is shown a hash value 314. The hash value 314 corresponds to the changed system32 file 310.
The duplicate database 308 records the hash value 312 of the given file, such as the system32 file 310. If the content information of the system32 file 310 recorded in the duplicate database 308 is changed, such as from "XXXX" to "YYYY" as shown in FIG. 3B, then the new hash value 314 is used to identify the changed system32 file 310 on the network attached storage systems 302, 304, and 306. Moreover, the change detection module 120 of FIG. 1A compares the content information of the changed system32 file 310 with the content information recorded by the duplicate database 308, such as the content information 118 of FIG. 1A, so as to detect the change on the system32 file 310. For example, the content information "YYYY" of the changed system32 file 310 is compared with the content information "XXXX" to detect the change on the changed system32 file 310.
Thereafter, the catalog 108 is notified of changes on the file 104A (or a file system) very close to the actual change, as shown by dotted lines in FIG. 3B. Further, the new hash value 314 is used to identify the location of the changed system32 file 310 on each of the network attached storage systems 302, 304, and 306. For example, the hash value 314 is used to represent the changed system32 files 310, which are located at each of the network attached storage systems 302, 304, and 306. Moreover, if the hash value 314 related to the changed system32 file 310 has not changed since the last comparison, then the change detection module 120 need not compare the hash value 314 at the subsequent storage location of the subsequent network attached storage system 302, 304, and 306. Beneficially, as compared to the conventional approach, the overall performance of each network attached storage system 302, 304, and 306 is improved by virtue of the catalog 108.
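The comparison of recorded and current content information described with reference to FIG. 3B may be sketched as follows. The disclosure refers only to a hash value; the use of SHA-256 from the Python standard library, the function names content_hash and detect_changes, and the (system, path) location shape are assumptions made for this illustration.

```python
import hashlib
from typing import Dict, List, Tuple

# Assumed shape of a storage location: (storage system, path on that system).
Location = Tuple[str, str]


def content_hash(data: bytes) -> str:
    """Hash file content; SHA-256 is an assumption, the text only says 'hash value'."""
    return hashlib.sha256(data).hexdigest()


def detect_changes(
    recorded_hash: str,
    current_contents: Dict[Location, bytes],
) -> Dict[str, List[Location]]:
    """Group the locations whose content no longer matches the recorded hash.

    The result maps each new hash value to the locations holding that changed
    content, so that identical changes can be handled together.
    """
    changed: Dict[str, List[Location]] = {}
    for location, data in current_contents.items():
        new_hash = content_hash(data)
        if new_hash != recorded_hash:
            changed.setdefault(new_hash, []).append(location)
    return changed
```

Grouping by the new hash value means that copies changed in the same way share one entry, which is what allows a single scan to stand for all of them.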
FIG. 3C is an illustration that depicts a catalog cooperation mechanism, in accordance with yet another embodiment of the present disclosure. FIG. 3C is described in conjunction with elements from FIGs. 1A, 3A, and 3B. With reference to FIG. 3C, there is shown an antivirus 316. The antivirus 316 corresponds to the scanning module 124 of FIG. 1A.
In an example, a virus may have changed the content information of the given file, such as for the system32 file 310, the content information is changed from "XXXX" to "YYYY", as shown in FIG.3C. Thereafter the hash value 314 of the changed system32 file 310 is used by the catalog to identify the location of the changed system32 file 310 (shown by dotted lines) on each of the network attached storage systems 302, 304, and 306. Moreover, if the system32 file 310 is a critical system file that is not supposed to change, then the catalog 108 starts actions to verify if this change is legitimate or if it is malicious. In an implementation, the catalog 108 schedules an immediate antivirus scan using the antivirus 316 on a specific location of the network attached storage system that is not regularly used, and on a copy of the system32 file 310 which resides on the least used network attached storage system. For example, the catalog 108 schedules a virus scan using the antivirus 316 on the network attached storage system 302 (e.g. run scan on dev.ABC.not frequently used on respective directory), as shown in FIG. 3C by dotted lines. Beneficially, in comparison with the conventional approach, such scan is the most efficient scan since it does not affect production or heavily used development environment, and does not cause an unnecessary load on heavily used network attached storage systems, such as the network attached storage systems 304 and 306.
FIG. 3D is an illustration that depicts a catalog cooperation mechanism, in accordance with another embodiment of the present disclosure. FIG. 3D is described in conjunction with elements from FIGs. 1A, 3A, 3B, and 3C. In an implementation, if the antivirus 316 detects a virus on the system32 file 310 that it has scanned, then at this stage the catalog 108 already has the new hash value 314 of all the listed files that are affected. The catalog 108 further has information related to an original version and time of change of each affected system32 file 310, a behavioral pattern leading to the change (e.g., file ~/Downloads/I_am_a_malware.pdf added three seconds before the change), and the hash value 314 (or an identifier) of the affected system32 file 310. Moreover, based on the new hash value 314 of all the listed system32 files 310 that are affected, the catalog 108 can either proactively “fix” all the affected system32 files 310, if configured to do so, or send a warning to the management server 102 with all the hash values that need to be cleaned within the management server 102. Beneficially, in comparison with the conventional approach, such information is used to determine the location (or virtual machine) of the network attached storage systems 302, 304, and 306 that was initially affected, and the process that started the change (or spread). In addition, the next time the catalog 108 identifies the affected system32 file 310, or a file with a different name but the same contents (e.g., ~/Downloads/I_am_a_malware.pdf), the catalog 108 can take pre-emptive actions or warn that a virus is imminent. Moreover, the next time the affected system32 file 310 shows up, the catalog 108 does not need to schedule an antivirus scan, since it already holds information in its database that this file is an affected version of the duplicate file of the duplicate database 308.
FIG. 3E is an illustration that depicts a catalog cooperation mechanism for reducing a load on network attached storage, in accordance with an embodiment of the present disclosure. FIG. 3E is described in conjunction with elements from FIGs. 1A, 3A, 3B, 3C and 3D. With reference to FIG. 3E, there is shown a plurality of pictures directories 318, 320, 322, 324, and 326. There is further shown a number of hash values 328, 330, 332, and 334. The hash values 328, 330, 332, and 334 correspond to the identifier 112 of FIG. 1A.
The pictures directories 318, 320, 322, 324, and 326 correspond to image files, which are stored on one or more of the network attached storage systems 302, 304 and 306. For example, the pictures directory 318, 320, and 322 are stored at the network attached storage system 302. Similarly, the pictures directory 324 are stored at the network attached storage system 304, and the pictures directory 326 are stored at the network attached storage system 306.
In an implementation, the network attached storage systems 302, 304, and 306 run a number of operating systems on different virtual machines, such as the storage locations 114A to 114N. Moreover, the system32 files 310 stored at different virtual machines are the same for each of the network attached storage systems 302, 304 and 306, as the operating systems are also the same for each of the network attached storage systems 302, 304 and 306. However, the content information 118 recorded by the duplicate database 308 is different for each of the network attached storage systems 302, 304, and 306. For example, the pictures directories 318, 320, 322, 324, and 326 are different for each of the network attached storage systems 302, 304, and 306, as every user saves his/her pictures there. Therefore, in such a case, the content information 118 recorded by the duplicate database 308 is scanned for each network attached storage system 302, 304, and 306, so that the load of the scan does not fall on a single network attached storage system. Moreover, the hash values are used to identify the location of the pictures directories on each of the network attached storage systems 302, 304, and 306, as shown in FIG. 3E. For example, the content information "YYYY" is scanned for the pictures directory 318 on the network attached storage system 302, which is identified using the hash value 328. Similarly, the content information "AAAA" is also scanned for the pictures directory 318 at the network attached storage system 302, which is identified using the corresponding hash value 330, and the like. Beneficially, as compared to the conventional approach, the compute required for the scan is much less than the compute that would be required in the conventional approach (i.e., if every conventional virtual machine scanned its own conventional files).
In an example, if a few of the system32 files 310 have already been scanned (e.g., pictures directory) on one of the storage server (or virtual machines), and the system32 files 310 have not changed since the last scan, then such files are not needed to be scanned again, since the catalog 108 tracks changes to files, and notes down that there has not been a change since the last scan.
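Skipping files that have not changed since the last scan may be sketched as below. The function name files_needing_scan and the two mappings from a file path to a hash value are assumptions introduced for illustration; how the catalog 108 actually stores the scan version is not specified in this form.

```python
from typing import Dict, List


def files_needing_scan(
    current_hashes: Dict[str, str],
    last_scanned_hashes: Dict[str, str],
) -> List[str]:
    """Return the paths whose content differs from the version that was last scanned.

    A path that was never scanned has no entry in last_scanned_hashes and is
    therefore always returned.
    """
    return [
        path
        for path, current in current_hashes.items()
        if last_scanned_hashes.get(path) != current
    ]
```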
Modifications to embodiments of the present disclosure described in the foregoing are possible without departing from the scope of the present disclosure as defined by the accompanying claims. Expressions such as "including", "comprising", "incorporating", "have", "is" used to describe and claim the present disclosure are intended to be construed in a non-exclusive manner, namely allowing for items, components or elements not explicitly described also to be present. Reference to the singular is also to be construed to relate to the plural. The word "exemplary" is used herein to mean "serving as an example, instance or illustration". Any embodiment described as “exemplary” is not necessarily to be construed as preferred or advantageous over other embodiments or to exclude the incorporation of features from other embodiments. The word "optionally" is used herein to mean "is provided in some embodiments and not provided in other embodiments". It is appreciated that certain features of the present disclosure, which are, for clarity, described in the context of separate embodiments, may also be provided in combination in a single embodiment. Conversely, various features of the invention, which are, for brevity, described in the context of a single embodiment, may also be provided separately or in any suitable combination or as suitable in any other described embodiment of the disclosure.

Claims

1. A management server (102) for managing the storage of files (104 A to 104N) in one or more storage server (106 A to 106N) the management server (102) comprising: a catalog (108) configured to store a first list (110) of the files (104A to 104N) stored in one or more monitored servers amongst the storage servers (106 A to 106N), the first list (110) recording, for each listed file, an identifier (112) for identifying the listed file, one or more storage locations (114A to 114N) for identifying the one or more monitored servers and one or more locations (116A to 116N) on the identified monitored servers, where the listed file is stored, and a content information (118) related to the content of the listed file, a change detection module (120) configured to detect a change having occurred on a given file at one or more of the storage locations (114A to 114N) recorded in the first list (110) for the given file, by determining the content information (118) related to the content of the changed given file and comparing it with the content information (118) recorded in the first list (110) for the given file, a storage location selection module (122) configured to select in the first list (110) one storage location for a given file amongst the storage locations (114A to 114N) where a change has been detected for the given file, a scanning module (124) configured to determine whether a detected change corresponds to a malicious change by scanning a given file for which a change is detected, at a selected storage location of the given file, a marking module (126) to mark all the files (104 A to 104N), stored at the locations (116A to 116N) where a change has been detected for the given file, as valid if the detected change is not determined as being a malicious change, and as affected if the detected change is determined as being a malicious change.
2. A management server (102) according to claim 1, wherein the storage location selection module (122) is configured to select in the first list (110) a storage server, amongst the monitored servers corresponding to the storage locations (114A to 114N) where the change has been detected for a given file, that is the least used one, and to select in the first list (110) one of the locations of the given file in the selected storage server.
3. A management server (102) according to claim 2, wherein the storage location selection module (122) is configured to compare the number of reads and/or writes performed on the monitored servers corresponding to the storage locations (114A to 114N) where the change has been detected for the given file, in a given period of time, and to select in the first list (110) one of the storage servers (106A to 106N), corresponding to the storage locations (114A to 114N) where the change has been detected for the given file, for which the number of reads and/or writes is the smallest.
4. A management server (102) according to any of claims 1 to 3, wherein the catalog (108) is further configured to store a second list (128), of the files (104A to 104N) stored in one or more monitored servers amongst the storage servers (106 A to 106N) and that have been affected by a malicious change, the second list (128) recording, for each listed file, the identifier (112) for identifying the listed file, the one or more storage locations (114A to 114N) for identifying the one or more monitored servers and the locations (116A to 116N) on the corresponding monitored servers, where the listed file is stored, the content information (118) related to the content of the listed file.
5. A management server (102) according to claim 4, further comprising a threat recording module (130) configured to record in the second list (128), when a detected change is determined as a malicious change, the identifier (112) of the given file, the one or more storage locations (114A to 114N) of the given file where the change has been detected, and the content information (118) related to the content of the changed given file.
6. A management server (102) according to any of claims 4 and 5, wherein the catalog (108) is further configured to record in the second list (128), for a given listed file, and for each of the storage location (114A to 114N) of the given listed file, the time of a detected change determined as malicious.
7. A management server (102) according to claim 6, further comprising a sorting module (132) configured to sort, for a given file listed in the second list (128), the corresponding storage locations (114A to 114N) by time of malicious change.
8. A management server (102) according to any of claims 1 to 7, wherein the catalog (108) is further configured to record in the first list (110), for each listed file, a protection mark for identifying the listed file as a protected or unprotected file, wherein the change detection module (120) is further configured to determine if the changed given file is marked as protected in the first list (110) and to trigger the storage selection module (122) and the scanning module (124) only if the changed given file is marked as protected in the first list (110).
9. A method (200) of file storage management in a management server (102), said management server (102) comprising one or more storage servers (106A to 106N) configured to store files, the method (200) comprising: configuring a catalog (108) to store a first list (110), of files (104A to 104N) stored in one or more monitored servers amongst the storage servers (106 A to 106N), the first list (110) recording, for each listed file, an identifier (112) for identifying the listed file, one or more storage locations (114A to 114N) for identifying the one or more monitored servers and one or more locations (116A to 116N) on the identified monitored servers, where the listed file is stored, and a content information (118) related to the content of the listed file, detecting a change having occurred on a given file at one or more of the storage locations (114A to 114N) recorded in the first list (110) for the given file, by determining the content information (118) related to the content of the changed given file and comparing it with the content information (118) recorded in the first list (110) for the given file, selecting in the first list (110) one storage location for the given file amongst the storage locations (114A to 114N) where the change has been detected for the given file, determining whether the detected change corresponds to a malicious change by scanning the given file at the selected storage location, marking all the files (104 A to 104N), stored at the storage locations (116A to 116N) where the change has been detected for the given file, as valid if the detected change is not determined as being a malicious change, and as affected if the detected change is determined as being a malicious change.
10. A method (200) according to claim 9, wherein selecting in the first list (110) one storage location comprises selecting in the first list (110) a storage server, amongst the monitored servers corresponding to the storage locations (114A to 114N) where the change has been detected for the given file, that is the least used one, and selecting in the first list (110) one of the locations (116A to 116N) of the given file in the selected storage server.
11. A method (200) according to claim 10, wherein selecting in the first list (110) a storage server comprises comparing the number of reads and/or writes performed on the monitored storage servers (106A to 106N) corresponding to the storage locations (114A to 114N) where the change has been detected for the given file, in a given period of time, and selecting in the first list (110) one of the storage servers (106A to 106N), corresponding to the storage locations (114A to 114N) where the change has been detected for the given file, for which the number of reads and/or writes is the smallest.
12. A method (200) according to any of claims 9 to 11, wherein configuring the catalog (108) further comprises storing a second list (128), of the files (104A to 104N) stored in one or more monitored servers amongst the storage servers (106A to 106N) and that have been affected by a malicious change, the second list (128) recording, for each listed file, the identifier (112) for identifying the listed file, the one or more storage locations (114A to 114N) for identifying the one or more monitored servers and the locations (116A to 116N) on the corresponding monitored servers, where the listed file is stored, and the content information (118) related to the content of the listed file.
13. A method (200) according to claim 12, further comprising, when determining that the detected change corresponds to a malicious change, recording in the second list (128) the identifier (112) of the given file, the one or more storage locations (114A to 114N) of the given file where the change has been detected, and the content information (118) related to the content of the changed given file.
14. A method (200) according to any of claims 12 and 13, further comprising recording in the second list (128), for a given listed file, and for each of the storage locations (114A to 114N) of the given listed file, the time of the detected change determined as malicious.
15. A method (200) according to claim 14, further comprising, for a given file listed in the second list (128), sorting the corresponding storage locations (114A to 114N) by time of malicious change.
16. A method (200) according to any of claims 9 to 15, wherein the first list (110) further comprises, for each listed file, a protection mark for identifying the listed file as a protected or unprotected file, wherein the method (200) further comprises, after detecting a change, verifying whether the changed given file is marked as protected in the first list, and wherein selecting in the first list (110) one storage location and determining whether the detected change corresponds to a malicious change is performed only if the changed given file is marked as protected in the first list (110).
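By way of non-limiting illustration, the method of claims 9 to 16 may be sketched in Python as follows. The sketch is not taken from the disclosure. The names CatalogEntry, digest and handle_entry, the use of a SHA-256 digest as the content information (118), and the inputs read_file, scan, io_counts and change_times are all assumptions introduced for the example. The sketch further assumes that every changed copy of a given file carries the same new content, so that scanning a single copy is representative.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class CatalogEntry:
    """One row of the first list (110); field names are illustrative."""
    file_id: str            # identifier (112)
    locations: list         # (server, path) pairs: (114) and (116)
    content_hash: str       # content information (118), assumed SHA-256
    protected: bool = True  # protection mark of claim 16


def digest(data: bytes) -> str:
    """Content information is assumed to be a SHA-256 hex digest."""
    return hashlib.sha256(data).hexdigest()


def handle_entry(entry, read_file, scan, io_counts, change_times, second_list):
    """Check one catalog entry and return {location: "valid" | "affected"}.

    Hypothetical inputs: read_file(server, path) -> bytes; scan(data) -> True
    if malicious; io_counts maps server -> reads + writes in the period;
    change_times maps location -> time of the detected change.
    """
    # Claim 9: a copy has changed if its content information differs
    # from the value recorded in the first list.
    changed = {}
    for loc in entry.locations:
        data = read_file(*loc)
        if digest(data) != entry.content_hash:
            changed[loc] = data
    # Claim 16: unprotected files are not scanned.
    if not changed or not entry.protected:
        return {}
    # Claims 10 and 11: choose the copy on the least used server.
    chosen = min(changed, key=lambda loc: io_counts.get(loc[0], 0))
    # Claim 9: scan that single copy and apply the verdict to every changed
    # copy (this assumes all changed copies carry the same new content).
    malicious = scan(changed[chosen])
    if malicious:
        # Claims 13 to 15: record the file in the second list, with its
        # changed locations sorted by time of malicious change.
        record = second_list.setdefault(
            entry.file_id,
            {"content_hash": digest(changed[chosen]), "locations": []},
        )
        record["locations"].extend((change_times[loc], loc) for loc in changed)
        record["locations"].sort()
    status = "affected" if malicious else "valid"
    return {loc: status for loc in changed}
```

In this sketch a single call to scan covers all changed copies of the file, which is the saving the claims aim at. Selecting the copy on the server with the smallest read/write count keeps the scan away from busy servers.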

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/EP2021/061091 WO2022228664A1 (en) 2021-04-28 2021-04-28 Management server and method for file storage management
CN202180096730.4A CN117099101A (en) 2021-04-28 2021-04-28 Management server and method for file storage management

Publications (1)

Publication Number Publication Date
WO2022228664A1 (en) 2022-11-03

Family

ID=75746628

Country Status (2)

Country Link
CN (1) CN117099101A (en)
WO (1) WO2022228664A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220405188A1 (en) * 2021-06-21 2022-12-22 Red Hat, Inc. Monitoring activity of an application prior to deployment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8443445B1 (en) * 2006-03-31 2013-05-14 Emc Corporation Risk-aware scanning of objects
US20170180394A1 (en) * 2015-12-16 2017-06-22 Carbonite, Inc. Systems and methods for automatic detection of malicious activity via common files
US20200285743A1 (en) * 2019-03-08 2020-09-10 Acronis International Gmbh System and method for performing an antivirus scan using file level deduplication

Also Published As

Publication number Publication date
CN117099101A (en) 2023-11-21

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 21722439; Country of ref document: EP; Kind code of ref document: A1
WWE WIPO information: entry into national phase. Ref document number: 202180096730.4; Country of ref document: CN
NENP Non-entry into the national phase. Ref country code: DE
122 EP: PCT application non-entry in European phase. Ref document number: 21722439; Country of ref document: EP; Kind code of ref document: A1