WO2022172422A1 - Information processing device, information processing method, and information processing program - Google Patents
- Publication number
- WO2022172422A1 (application PCT/JP2021/005370)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- file
- static
- unit
- learning
- hash value
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/64—Protecting data integrity, e.g. using checksums, certificates or signatures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/55—Detecting local intrusion or implementing counter-measures
- G06F21/56—Computer malware detection or handling, e.g. anti-virus arrangements
- G06F21/562—Static detection
- G06F21/565—Static detection by checking file integrity
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Definitions
- the present invention relates to an information processing device, an information processing method, and an information processing program.
- File hash values are often used to determine software authenticity.
- A hash value is a numerical sequence of compact bit length derived from original data by a one-way hash function.
- A hash value generated by a hash function is treated as unique to the original data: if two hash values are the same, the underlying data can, for practical purposes, be regarded as the same.
- The hash value at a time when no tampering has occurred is stored as the correct value for the normal state. An authenticity determination program is then executed periodically on the determination target files in the device, and software authenticity is determined using the stored correct values.
- In software authenticity determination, unauthorized alteration of a determination target file is detected by calculating a hash value from the file and comparing it with the correct value.
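The hash comparison described above can be sketched as follows. This is a minimal illustration, not the method claimed in the document: the text does not name a hash function, so SHA-256 is an assumption, and `file_hash` and `is_unaltered` are hypothetical helper names.

```python
import hashlib
from pathlib import Path


def file_hash(path: Path) -> str:
    """Return the SHA-256 digest of a file (SHA-256 is an assumed choice)."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unaltered(path: Path, correct_value: str) -> bool:
    """Compare the file's current hash with the stored correct value."""
    return file_hash(path) == correct_value
```

The correct value would be recorded once while the file is known to be in its normal state, and `is_unaltered` would then be called on each periodic check.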
- A fixed hash value cannot be obtained for a file whose contents are updated periodically or a file that is written to in response to an event such as program execution. Including such files among the determination targets therefore causes false detections in authenticity determination based on hash value comparison. For this reason, such determination presupposes that each target file is a static file, that is, a file whose contents do not change.
- static file selection methods include methods such as static analysis, dynamic analysis, and selection using snapshots.
- Static analysis is a method of excluding changed files by referring to the meta information attached to the files.
- In static analysis, the files of a specific software package are targeted. The meta information that the package developer assigns to each file is consulted, files with an overwrite tag (that is, dynamic files that are rewritten) are excluded, and the remaining files are selected as static files.
- The files that can be selected are limited to those for which meta information is defined.
- Dynamic analysis is a method of monitoring system calls for a certain period of time and extracting files that have not changed as static files.
- Unlike static analysis, dynamic analysis can target files for which meta information is not defined. File behavior is monitored through system calls during a fixed monitoring period, files that changed during that period are excluded, and the remaining files are selected as static files.
- If the monitoring time is sufficiently long, accuracy improves. Within the time spans practical in actual operation, however, the selection may include files that merely happened not to change during the monitoring time, or files that change on a long cycle.
- Selection using snapshots monitors file behavior in the same way as dynamic analysis: snapshots of the files are taken twice at a fixed interval and the difference between them is analyzed. Files that have not changed are selected as static files. As with dynamic analysis, this method may also include files that happened not to change during the monitoring period or files that change on a long cycle.
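A minimal sketch of snapshot-based selection, assuming each snapshot is represented as a mapping from file path to content hash. The representation, the function name, and the example paths are all hypothetical.

```python
from typing import Dict, List


def unchanged_between_snapshots(first: Dict[str, str], second: Dict[str, str]) -> List[str]:
    """Return paths present in both snapshots whose hash did not change."""
    return sorted(path for path, h in first.items() if second.get(path) == h)


# Hypothetical snapshots taken at two points in time.
snapshot_a = {"/usr/bin/tool": "h1", "/var/log/app.log": "h2", "/srv/monthly.dat": "h3"}
snapshot_b = {"/usr/bin/tool": "h1", "/var/log/app.log": "h9", "/srv/monthly.dat": "h3"}

static_candidates = unchanged_between_snapshots(snapshot_a, snapshot_b)
print(static_candidates)
```

The limitation noted in the text is visible here: a file such as the hypothetical `/srv/monthly.dat`, which changes only on a long cycle, is selected as static simply because it did not change between the two snapshots.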
- A technique has been proposed (Patent Document 1). A technique has also been proposed for comparing common definition information with files to be managed, determining whether there is a file that satisfies all the conditions of the common definition information, and extracting the common definition information corresponding to the file to be managed (Patent Document 2).
- JP 2019-8377 A; Japanese Patent Application Laid-Open No. 2020-71560
- In static analysis, however, the determination targets are limited to files with defined meta information, so the scope of software authenticity determination may narrow and its accuracy may decrease.
- Selection methods using dynamic analysis and snapshots are based on the time axis, so files that merely did not change during the monitored time may be included. As a result, a file that should not be subject to software authenticity determination may be targeted, and the accuracy of the determination may decrease.
- the present invention has been made in view of the above, and aims to improve the accuracy of software authenticity determination and to operate the system stably.
- the learning unit learns each feature of dynamic files and static files to generate a learning model.
- The extraction unit acquires the predetermined file group at a reference point in time from an external device that uses the predetermined file group, and extracts the determination target files, which are the static files, from the predetermined file group based on the learning model.
- the accuracy of software authenticity determination can be improved and the system can be operated stably.
- FIG. 1 is a block diagram of an authenticity determination system according to an embodiment.
- FIG. 2 is a diagram for explaining creation of teacher data.
- FIG. 3 is a diagram illustrating an example of a result of classifying falsified files by the authenticity determination server according to the embodiment.
- FIG. 4 is a flowchart of processing in the learning phase of the authenticity determination server according to the embodiment.
- FIG. 5 is a flow chart of processing in the classification phase and normal state hash value storage phase of the authenticity determination server according to the embodiment.
- FIG. 6 is a flowchart of processing in the authenticity determination phase of the authenticity determination server according to the embodiment.
- FIG. 7 is a diagram showing an example of a computer that executes a learning program.
- FIG. 1 is a block diagram of an authenticity determination system according to an embodiment.
- The authenticity determination system 1 includes an authenticity determination server 10, which is an information processing device; a monitoring target device 20, which is a device external to the authenticity determination server 10; and a learning data providing device 30.
- the authenticity determination server 10, the monitored device 20, and the learning data providing device 30 are connected via a network.
- the monitoring target device 20 is, for example, a control device arranged in an infrastructure system or a server device that provides infrastructure services.
- The monitoring target device 20 is a server that may be exposed to the threat of unauthorized alteration of software. As a target of software authenticity determination, it requires that unauthorized alteration be detected and dealt with quickly to avoid a situation that is difficult to recover from.
- The monitoring target device 20 has a hash value generation unit 21 and a file group 22.
- The file group 22 is a data group used for operation of the monitoring target device 20, and includes both static files and dynamic files. File data included in the file group 22 may be subject to unauthorized alteration. That is, the file data included in the file group 22 is subject to the software authenticity determination by the authenticity determination server 10.
- The hash value generation unit 21 receives from the authenticity determination server 10 a request to transmit the hash value of each file data included in the file group 22 of the monitoring target device 20 in operation.
- The hash value generation unit 21 has the same hash function as the authenticity determination server 10.
- The hash value generation unit 21 then calculates a hash value of each file data included in the file group 22 and outputs the calculated hash values to the authenticity determination server 10.
- The learning data providing device 30 is a computer that provides file data used for learning by the authenticity determination server 10.
- The learning data providing device 30 provides the authenticity determination server 10 with the file data, specified by the authenticity determination server 10, that is used for training the learning model for software authenticity determination.
- the learning data providing device 30 stores various OS (Operating System) files.
- the learning data providing device 30 has a plurality of virtual servers each holding a different OS system file.
- The authenticity determination server 10 includes a verification unit 101, a verification result registration unit 102, an authenticity determination instruction unit 103, a normal state hash value storage unit 104, a hash value generation unit 105, a file information acquisition unit 106, an extraction unit 107, a teacher data creation unit 108, and a learning unit 109.
- The teacher data creation unit 108 and the learning unit 109 perform processing for generating the determination target classification learning model 110 used when performing software authenticity determination on file data held by the monitoring target device 20. That is, the teacher data creation unit 108 and the learning unit 109 perform processing in the learning phase of the determination target classification learning model 110.
- The teacher data creation unit 108 acquires file data used for learning from the learning data providing device 30.
- The teacher data creation unit 108 acquires file data of static files and file data of dynamic files specified by the administrator from the learning data providing device 30.
- Any one of the following three types, or a combination of them, is selected according to the OS domain type as the file data to be used for learning.
- The first is the 64-bit version of centOS8 (registered trademark), a Linux (registered trademark) distribution.
- The second is the 64-bit version of Ubuntu 20.04 (registered trademark), which is also a Linux distribution.
- The third is the 64-bit version of Windows 10 (registered trademark). They are hereinafter referred to simply as "centOS", "Ubuntu", and "Windows", respectively.
- Data that can be identified as either a static file or a dynamic file is selected from the file data of each OS. In this embodiment, whether given file data is a static file or a dynamic file is determined according to the following criteria.
- File data collected from centOS and Ubuntu and used for learning is classified as a static file or a dynamic file with reference to the Linux Filesystem Hierarchy Standard. More specifically, file data residing under specific directories that store immutable files, such as static configuration files and read-only files, is considered a static file. For example, files under /etc, /boot, and /usr/bin are considered static files. File data existing under a specific directory that stores transient or temporary files, such as spool files and log files, is considered a dynamic file. For example, file data under /var is considered a dynamic file.
- For file data collected from Windows and used for learning, file data last updated one year or more ago is considered a static file, and file data last updated less than one year ago is considered a dynamic file.
- For the file data of centOS and Ubuntu, the teacher data creation unit 108 receives a designation that file data existing under a specific directory that stores immutable files is a static file. It also receives a designation that file data existing under a specific directory that stores transient or temporary file data is a dynamic file.
- For the file data of Windows, the teacher data creation unit 108 receives a static file designation for files last updated one year or more ago, and a dynamic file designation for files last updated less than one year ago.
- The teacher data creation unit 108 collects the binaries of the static files and the dynamic files in the file data group used for learning from the learning data providing device 30 according to the above designations.
- FIG. 2 is a diagram for explaining the creation of teacher data.
- The teacher data creation unit 108 attaches a label of "1" to collected files that can be regarded as dynamic files and a label of "0" to files that can be regarded as static files, and uses these as the teacher label for each file.
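The labeling criteria can be sketched as follows. This is an illustration under assumptions, not the embodiment's implementation: the directory lists contain only the examples given in the text (reading the binary directory as `/usr/bin`), "one year" is approximated as 365 days, and the function names are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional

STATIC_DIRS = ("/etc", "/boot", "/usr/bin")  # example directories for static files
DYNAMIC_DIRS = ("/var",)                     # example directory for dynamic files


def _is_under(path: str, directory: str) -> bool:
    """True if path is the directory itself or lies below it."""
    return path == directory or path.startswith(directory + "/")


def linux_label(path: str) -> Optional[int]:
    """Return 0 (static), 1 (dynamic), or None when no rule applies."""
    if any(_is_under(path, d) for d in DYNAMIC_DIRS):
        return 1
    if any(_is_under(path, d) for d in STATIC_DIRS):
        return 0
    return None


def windows_label(last_updated: datetime, now: datetime) -> int:
    """Return 0 (static) if last updated one year or more ago, else 1 (dynamic)."""
    return 0 if now - last_updated >= timedelta(days=365) else 1
```

Files for which `linux_label` returns `None` would simply be left out of the teacher data in this sketch, since only files that can be identified as static or dynamic are used for learning.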
- The teacher data creation unit 108 creates teacher data 202 by building, for each file, a feature vector that holds the number of occurrences of each 1-byte value expressed in hexadecimal.
- The teacher data 202 in FIG. 2 represents, for each file, a vector in which the occurrence counts of the hexadecimal values 00 to ff are arranged in order.
- In this embodiment, only the byte occurrence counts are used as the feature amount for each file; the file size and the like are not taken into consideration.
- The feature amount calculation method is not limited to this, and the teacher data creation unit 108 may obtain the feature amount in consideration of other indices such as file size.
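The byte-count feature vector can be sketched as below. The function name `byte_count_vector` is hypothetical; the 256-element layout follows the description of counts arranged in order from 00 to ff.

```python
from collections import Counter
from typing import List


def byte_count_vector(data: bytes) -> List[int]:
    """Return a 256-element vector: element i is the count of byte value i."""
    counts = Counter(data)  # iterating over bytes yields integers 0..255
    return [counts.get(value, 0) for value in range(256)]


vector = byte_count_vector(b"\x00\x00\xff")
print(vector[0x00], vector[0xff], len(vector))
```

Because the vector holds raw counts, larger files produce larger values. The text notes that file size is not otherwise taken into account, so any normalization would be a separate design decision.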
- The teacher data creation unit 108 outputs to the learning unit 109 the teacher data, which is binary data to which a teacher label indicating whether the file is a static file or a dynamic file is added.
- The learning unit 109 receives the teacher data from the teacher data creation unit 108. The learning unit 109 then performs learning using the acquired teacher data, tunes the hyperparameters so as to maximize classification accuracy, and generates a learned determination target classification learning model 110. That is, the learning unit 109 learns each feature of dynamic files and static files to generate a learning model.
- The determination target classification learning model 110 corresponds to an example of a learning model. More specifically, the learning unit 109 learns each feature of the dynamic file and the static file using the binary data of the static file and the dynamic file. The learning unit 109 also performs learning using teacher data for each OS domain type.
- The learning unit 109 uses a classification algorithm called a support vector machine, performs parameter tuning by grid search together with cross-validation, and selects the model with the highest classification accuracy as the determination target classification learning model 110. After that, the learning unit 109 outputs the learned determination target classification learning model 110 to the extraction unit 107.
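The grid search with cross-validation described above can be sketched in dependency-free Python. Note the substitution: to keep the example self-contained, a k-nearest-neighbour classifier stands in for the support vector machine, and the hyperparameter being tuned is k. In practice a library such as scikit-learn (`SVC` with `GridSearchCV`) would be a more natural fit. All names and the toy data are hypothetical.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]  # (feature vector, label: 0 static / 1 dynamic)


def knn_predict(train: List[Sample], x: Sequence[float], k: int) -> int:
    """Majority vote among the k nearest training samples (squared Euclidean)."""
    nearest = sorted(train, key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], x)))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > len(nearest) else 0


def cross_val_accuracy(data: List[Sample], k: int, folds: int) -> float:
    """Mean accuracy over the folds; fold i holds out samples i, i+folds, ..."""
    scores = []
    for i in range(folds):
        held_out = data[i::folds]
        train = [s for j, s in enumerate(data) if j % folds != i]
        correct = sum(knn_predict(train, x, k) == y for x, y in held_out)
        scores.append(correct / len(held_out))
    return sum(scores) / folds


def grid_search(data: List[Sample], k_grid: Sequence[int], folds: int = 3) -> Tuple[int, float]:
    """Return the (k, score) pair with the highest cross-validated accuracy."""
    return max(((k, cross_val_accuracy(data, k, folds)) for k in k_grid), key=lambda t: t[1])


# Toy data: two well-separated clusters, interleaved so each fold is balanced.
static = [(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 0.5), (0.2, 0.8)]
dynamic = [(5, 5), (6, 5), (5, 6), (6, 6), (5.5, 5.5), (5.2, 5.8)]
data: List[Sample] = [s for pair in zip(static, dynamic) for s in ((pair[0], 0), (pair[1], 1))]

best_k, best_score = grid_search(data, k_grid=(1, 3, 8))
print(best_k, best_score)
```

The structure is the same regardless of the classifier: each candidate hyperparameter value is scored by cross-validation, and the best-scoring configuration is kept as the model.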
- a classification algorithm called a support vector machine
- The file information acquisition unit 106 and the extraction unit 107 perform processing for classifying and extracting the static files used as determination targets when performing software authenticity determination on file data held by the monitoring target device 20. That is, the file information acquisition unit 106 and the extraction unit 107 perform processing in the classification phase of the determination target classification learning model 110.
- The file information acquisition unit 106 acquires from the monitoring target device 20 the file group 22 that the device held at the reference point in time when the normal state was confirmed. The file information acquisition unit 106 then outputs the acquired file group 22 to the extraction unit 107.
- The extraction unit 107 acquires the file group 22 from the file information acquisition unit 106. The extraction unit 107 then inputs the acquired file group 22 to the learned determination target classification learning model 110 and classifies its files into static files and dynamic files. That is, the extraction unit 107 extracts the determination target files, which are static files, from a predetermined file group based on the learning model.
- The file group 22 corresponds to an example of a predetermined file group.
- The determination target classification learning model 110 corresponds to an example of a learning model. More specifically, the extraction unit 107 receives an input of a predetermined file group, classifies each file in the group as a static file or a dynamic file based on the learning model, and extracts the static files.
- The extraction unit 107 extracts the static files included in the file group 22 and causes the authenticity determination server 10 to hold the extracted static files as the determination target file group 120. At this time, the extraction unit 107 adds a determination target file list representing the extracted static files to the determination target file group 120.
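The extraction step can be sketched as a filter over the file group. The trained model is represented here by an arbitrary callable that returns 0 for static and 1 for dynamic. `toy_classify` is a purely hypothetical stand-in for the learned model, and the file names and contents are illustrative.

```python
from typing import Callable, Dict, List


def extract_targets(file_group: Dict[str, bytes], classify: Callable[[bytes], int]) -> List[str]:
    """Return the sorted paths of files classified as static (label 0)."""
    return sorted(path for path, data in file_group.items() if classify(data) == 0)


def toy_classify(data: bytes) -> int:
    """Hypothetical stand-in for the learned model: 'LOG' prefix means dynamic."""
    return 1 if data.startswith(b"LOG") else 0


file_group = {
    "/usr/bin/tool": b"\x7fELF...",
    "/var/log/app.log": b"LOG entry 1\n",
    "/etc/app.conf": b"key=value\n",
}
target_list = extract_targets(file_group, toy_classify)
print(target_list)
```

The returned list plays the role of the determination target file list: only the files it names are hashed and compared in the later phases.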
- The hash value generation unit 105 and the normal state hash value storage unit 104 perform processing for securing the hash value of each determination target file in the normal state, which serves as the standard for software authenticity determination. That is, the hash value generation unit 105 and the normal state hash value storage unit 104 perform the processing of the normal state hash value storage phase.
- The hash value generation unit 105 acquires each determination target file in the normal state included in the determination target file group 120. Next, the hash value generation unit 105 calculates a hash value of each acquired determination target file. That is, the hash value generation unit 105 obtains the first hash value of each determination target file extracted by the extraction unit 107. After that, the hash value generation unit 105 stores the hash value of each determination target file in the normal state in the normal state hash value storage unit 104.
- The normal state hash value storage unit 104 acquires from the hash value generation unit 105, and stores, the hash value calculated from each determination target file in the normal state at the reference time.
- the hash value calculated from the normal state determination target file will be referred to as a “normal state hash value”.
- The authenticity determination instruction unit 103, the verification unit 101, and the verification result registration unit 102 perform software authenticity determination processing for the file group 22 of the monitoring target device 20 during operation.
- That is, the authenticity determination instruction unit 103, the verification unit 101, and the verification result registration unit 102 perform processing in the authenticity determination phase.
- The authenticity determination instruction unit 103 acquires the identification information of each determination target file from the determination target file list added to the determination target file group 120. The authenticity determination instruction unit 103 then transmits a hash value calculation request, together with the identification information of each determination target file, to the hash value generation unit 21 of the monitoring target device 20. The authenticity determination instruction unit 103 starts this software authenticity determination periodically, for example once a day.
- The verification unit 101 receives the hash values of the file group 22 held by the monitoring target device 20 in operation from the hash value generation unit 21 of the monitoring target device 20.
- The file group 22 held by the monitoring target device 20 in operation is the file group 22 after time has elapsed from the reference point in time, and may have been tampered with.
- A hash value of the file group 22 held by the monitoring target device 20 in operation is referred to as a "falsification possibility existence hash value".
- The verification unit 101 acquires the normal state hash value of each determination target file from the normal state hash value storage unit 104. The verification unit 101 then compares the falsification possibility existence hash value and the normal state hash value of each determination target file and determines whether the values match. In this way, the verification unit 101 determines whether the determination target file at that time matches the determination target file at the reference time. The verification unit 101 determines that a determination target file whose values match has not been tampered with, and that a determination target file whose values do not match has been tampered with. After that, the verification unit 101 outputs to the verification result registration unit 102 a verification result for the monitoring target device 20, which indicates whether the file group 22 of the monitoring target device 20 has been tampered with.
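The comparison performed by the verification unit can be sketched as follows, assuming both sets of hash values are held as mappings from a file identifier to a hash string. Treating a file that is missing from the current values as tampered is an assumption of this sketch, not something the text specifies. The function name and the example values are hypothetical.

```python
from typing import Dict, List


def find_tampered(normal_state: Dict[str, str], current: Dict[str, str]) -> List[str]:
    """Return IDs of determination target files whose hash values do not match.

    A target file with no current hash is reported as well (assumption).
    """
    return sorted(fid for fid, h in normal_state.items() if current.get(fid) != h)


normal_state = {"/etc/app.conf": "aaa", "/usr/bin/tool": "bbb"}
current = {"/etc/app.conf": "aaa", "/usr/bin/tool": "ccc", "/var/log/app.log": "zzz"}

tampered = find_tampered(normal_state, current)
print(tampered)
```

Only files on the determination target list are compared, so the dynamic file `/var/log/app.log` in the current values is ignored and does not cause a false detection.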
- In other words, the verification unit 101 acquires from the external device the second hash value of each determination target file held by the external device after time has elapsed from the reference time, compares the first hash value with the second hash value, and verifies whether the predetermined file group has been tampered with.
- The monitoring target device 20 corresponds to an example of the external device, the normal state hash value to an example of the first hash value, the falsification possibility existence hash value to an example of the second hash value, and the file group 22 to an example of the predetermined file group.
- The verification result registration unit 102 receives from the verification unit 101 the verification result of falsification of the monitoring target device 20.
- The verification result registration unit 102 registers the verification result indicating that the monitoring target device 20 has been tampered with in a verification result registration location of the authenticity determination server 10.
- The administrator can confirm that the monitoring target device 20 has been illegally tampered with by checking the verification result registration location in the authenticity determination server 10.
- FIG. 3 is a diagram showing an example of a result of classifying falsified files by the authenticity determination server according to the embodiment.
- The results of classifying falsified files by the authenticity determination server 10 according to the present embodiment will be described.
- Three cases are described as examples, differing in the teacher data set that is the original data for creating teacher data: a case using file data of centOS; a case using file data of centOS and Ubuntu; and a case using file data of centOS, Ubuntu, and Windows.
- In the first case, the determination target classification learning model 110 is generated using file data of centOS.
- Classification accuracy is expressed as an AUC (Area Under the Curve) value.
- When centOS file data is used as verification data, the AUC value is 0.9 or more, and it can be said that the classification is performed with high accuracy.
- When Ubuntu file data is used as verification data, the AUC value is 0.8 or more, and it can be said that the classification is also performed with high accuracy. That is, even for Ubuntu file data that was not used for learning, classification can be performed with high accuracy, which shows generalization performance within the same OS domain.
- When Windows file data is used as verification data, the AUC value is less than 0.5 and the classification accuracy is low.
- In the second case, the determination target classification learning model 110 is generated using file data of centOS and Ubuntu.
- When centOS file data is used as verification data, the AUC value is 0.9 or more, and it can be said that the classification is performed with high accuracy.
- When Ubuntu file data is used as verification data, the AUC value is 0.9 or more, and it can be said that the classification is also performed with high accuracy. That is, classification is highly accurate when the verification data is of the same domain type as the teacher data used for learning.
- When Windows file data is used as verification data, the AUC value is 0.6 or more, which indicates moderate classification accuracy.
- In the third case, the determination target classification learning model 110 is generated using file data of centOS, Ubuntu, and Windows.
- Whether centOS file data, Ubuntu file data, or Windows file data is used as verification data, the AUC value is 0.9 or more, and it can be said that the classification is performed with high accuracy. That is, classification is highly accurate when the verification data is of the same domain type as the teacher data used for learning.
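For reference, an AUC value of the kind quoted above can be computed from labels and classifier scores as the probability that a randomly chosen dynamic file (label 1) receives a higher score than a randomly chosen static file (label 0), with ties counted as one half. The sketch below uses this pairwise definition; the function name and the example scores are hypothetical and are not results from the document.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


example = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(example)
```

A value of 1.0 means every dynamic file is scored above every static file, while 0.5 corresponds to chance-level ranking. This is why a value below 0.5 on Windows verification data is read as low accuracy.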
- FIG. 4 is a flowchart of processing in the learning phase of the authenticity determination server according to the embodiment. Next, with reference to FIG. 4, the flow of processing in the learning phase of the authenticity determination server 10 according to this embodiment will be described.
- The teacher data creation unit 108 acquires file data used for learning from the learning data providing device 30.
- The teacher data creation unit 108 acquires the binary data of each static file and dynamic file in the file data group used for learning from the learning data providing device 30 according to instructions from the administrator (step S101).
- The teacher data creation unit 108 attaches a label representing a dynamic file or a static file to each file, and creates teacher data by building, for each file, a feature vector that holds the number of occurrences of each 1-byte value expressed in hexadecimal (step S102).
- The teacher data creation unit 108 outputs to the learning unit 109 the teacher data, which is binary data to which a teacher label indicating whether the file is a static file or a dynamic file is added.
- The learning unit 109 performs learning using the teacher data acquired from the teacher data creation unit 108, tunes the hyperparameters so as to maximize classification accuracy, and generates the learned determination target classification learning model 110 (step S103).
- The learning unit 109 outputs the learned determination target classification learning model 110 to the extraction unit 107.
- the extraction unit 107 stores the determination target classification learning model 110 acquired from the learning unit 109 (step S104).
- FIG. 5 is a flowchart of processing in the classification phase and normal state hash value storage phase of the authenticity determination server according to the embodiment. Next, the flow of processing in the classification phase and the normal state hash value storage phase of the authenticity determination server 10 according to the present embodiment will be described with reference to FIG. 5.
- The file information acquisition unit 106 acquires from the monitoring target device 20 the file group 22 that the device held at the reference point in time when the normal state was confirmed (step S201).
- The extraction unit 107 acquires the file group 22 from the file information acquisition unit 106. The extraction unit 107 then inputs the obtained file group 22 to the learned determination target classification learning model 110 and classifies its files into static files and dynamic files (step S202).
- The extraction unit 107 extracts the static files included in the file group 22 and causes the authenticity determination server 10 to hold the extracted static files as the determination target file group 120.
- The hash value generation unit 105 acquires each determination target file in the normal state included in the determination target file group 120.
- the hash value generating unit 105 calculates a normal state hash value of each acquired determination target file (step S203).
- the hash value generation unit 105 stores the hash value of each determination target file in the normal state in the normal state hash value storage unit 104 (step S204).
- FIG. 6 is a flowchart of processing in the authenticity determination phase of the authenticity determination server according to the embodiment. Next, with reference to FIG. 6, the flow of processing in the authenticity determination phase of the authenticity determination server 10 according to the present embodiment will be described.
- The authenticity determination command unit 103 acquires the identification information of each determination target file from the determination target file list added to the determination target file group 120. Then, the authenticity determination command unit 103 transmits a hash value calculation request, together with the identification information of each determination target file, to the hash value generation unit 21 of the monitoring target device 20 (step S301).
- The hash value generation unit 21 of the monitoring target device 20 acquires each file data of the file group 22 and calculates a falsification possibility existence hash value for each.
- The verification unit 101 receives the falsification possibility existence hash values of the file group 22 of the monitoring target device 20 in operation from the hash value generation unit 21 of the monitoring target device 20 (step S302).
- The verification unit 101 acquires the normal state hash value of each determination target file from the normal state hash value storage unit 104. Then, the verification unit 101 compares the falsification possibility existence hash value with the normal state hash value of each determination target file and verifies whether each file data has been falsified (step S303).
- The verification result registration unit 102 receives from the verification unit 101 the result of verifying whether the monitoring target device 20 has been falsified.
- The verification result registration unit 102 registers the verification result for falsification of the monitoring target device 20 in the verification result registration location in the authenticity determination server 10 (step S304).
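The comparison in steps S302 to S304 reduces to a per-file equality check between two hash tables. The sketch below is illustrative only: SHA-256, the function names, the record layout, and the choice to treat a missing hash as falsified are all assumptions that the text does not specify.

```python
import hashlib
from typing import Dict, List


def sha256_hex(data: bytes) -> str:
    """Hex digest used for both hash tables (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def verify(normal: Dict[str, str], reported: Dict[str, str]) -> Dict[str, bool]:
    """Map each determination target file to True if it appears falsified.

    A file counts as falsified when its reported hash differs from the
    normal state hash value (S303). Treating a missing hash as falsified
    is an assumption of this sketch.
    """
    return {file_id: reported.get(file_id) != h for file_id, h in normal.items()}


def register(store: List[dict], device_id: str, result: Dict[str, bool]) -> None:
    """Append a verification record to a stand-in registration location (S304)."""
    tampered = sorted(f for f, bad in result.items() if bad)
    store.append({"device": device_id, "tampered": tampered})


# Hypothetical normal state hashes and hashes reported by the device in operation.
normal = {"/usr/bin/app": sha256_hex(b"v1"), "/etc/app.conf": sha256_hex(b"a=1")}
reported = {"/usr/bin/app": sha256_hex(b"v1"), "/etc/app.conf": sha256_hex(b"a=2")}

store: List[dict] = []
register(store, "device-20", verify(normal, reported))
```

Only files present in the normal state table are checked, which mirrors the idea that dynamic files are excluded from the determination targets.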
- The authenticity determination server 10 uses a learning model generated by learning the characteristics of static files and dynamic files to extract, from the file group 22 of the monitoring target device 20, the determination target files, which are static files. Then, the authenticity determination server 10 according to the present embodiment compares the normal state hash value obtained from each determination target file at the reference point in time, when it can be assumed that no falsification has been performed, with the falsification possibility existence hash value obtained from the determination target file held by the monitoring target device 20 in operation, and thereby detects tampering with the monitoring target device 20.
- Static files can be easily and comprehensively extracted from the file group 22 of the monitoring target device 20.
- By performing software authenticity determination with the extracted static files as the determination target files, a wide scope of software authenticity determination can be secured, and files that should not be subject to software authenticity determination can be excluded from the determination target files. Therefore, it is possible to improve the accuracy of software authenticity determination and operate the system stably.
- Each component of each illustrated device is functionally conceptual and does not necessarily need to be physically configured as illustrated.
- The specific form of distribution and integration of each device is not limited to the illustrated one, and all or part of the devices can be functionally or physically distributed or integrated in arbitrary units according to various loads and usage conditions.
- It is also possible to configure the monitoring target device 20 to include the normal state hash value storage unit 104 and the verification unit 101.
- When the verification unit 101 is arranged in the monitoring target device 20, the monitoring target device 20 transmits the verification result to the authenticity determination server 10, and the authenticity determination server 10 registers the obtained verification result.
- All or any part of each processing function performed by each device can be realized by a CPU (Central Processing Unit) and a program analyzed and executed by the CPU, or can be realized as hardware using wired logic.
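The variant mentioned above, in which the normal state hash value storage unit 104 and the verification unit 101 reside on the monitoring target device 20, can be sketched as follows. Everything here is hypothetical: the class names, the JSON message format, and the use of SHA-256 are assumptions made only to show the division of roles, with the device verifying locally and the server merely registering what it receives.

```python
import hashlib
import json
from typing import Dict, Iterable, List


class MonitoredDevice:
    """Device that stores its own normal state hashes and verifies locally."""

    def __init__(self, device_id: str, files: Dict[str, bytes]) -> None:
        self.device_id = device_id
        self.files = files
        self.normal_state: Dict[str, str] = {}

    def store_normal_state(self, targets: Iterable[str]) -> None:
        """Record hashes of the determination target files at the reference time."""
        self.normal_state = {
            path: hashlib.sha256(self.files[path]).hexdigest() for path in targets
        }

    def verify_locally(self) -> str:
        """Compare current hashes with the stored ones and serialize the result."""
        tampered = sorted(
            path
            for path, normal_hash in self.normal_state.items()
            if path not in self.files
            or hashlib.sha256(self.files[path]).hexdigest() != normal_hash
        )
        return json.dumps({"device": self.device_id, "tampered": tampered})


class AuthenticityServer:
    """Server that only registers verification results sent by devices."""

    def __init__(self) -> None:
        self.results: List[dict] = []

    def register(self, message: str) -> None:
        self.results.append(json.loads(message))


device = MonitoredDevice("device-20", {"/usr/bin/app": b"v1"})
device.store_normal_state(["/usr/bin/app"])
server = AuthenticityServer()
server.register(device.verify_locally())  # file unchanged
device.files["/usr/bin/app"] = b"v2"      # simulate falsification
server.register(device.verify_locally())
```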
- The authenticity determination server 10 can be implemented by installing, on a desired computer, an information processing program that executes the above information processing as package software or online software.
- An information processing device can be made to function as the authenticity determination server 10 by causing it to execute the above information processing program.
- The information processing device referred to here may include a desktop or notebook personal computer in addition to a server computer.
- Information processing devices also include mobile communication terminals such as smartphones, mobile phones, and PHS (Personal Handy-phone System) terminals, and slate terminals such as PDAs (Personal Digital Assistants).
- The authenticity determination server 10 can also be implemented as a management server device that treats a terminal device used by a user as a client and provides the client with services related to the above-described management processing.
- The management server device is implemented as a server device that receives a configuration input request and provides a management service for inputting the configuration.
- The management server device may be implemented as a Web server, or as a cloud that provides services related to the above management processing by outsourcing.
- FIG. 7 is a diagram showing an example of a computer that executes a learning program.
- The computer 1000 has, for example, a memory 1010 and a CPU 1020.
- The computer 1000 also has a hard disk drive interface 1030, a disk drive interface 1040, a serial port interface 1050, a video adapter 1060, and a network interface 1070. These units are connected by a bus 1080.
- The memory 1010 includes a ROM (Read Only Memory) 1011 and a RAM (Random Access Memory) 1012.
- The ROM 1011 stores a boot program such as a BIOS (Basic Input Output System).
- The hard disk drive interface 1030 is connected to the hard disk drive 1090.
- The disk drive interface 1040 is connected to the disk drive 1100.
- A removable storage medium such as a magnetic disk or an optical disk is inserted into the disk drive 1100.
- The serial port interface 1050 is connected to, for example, a mouse 1110 and a keyboard 1120.
- The video adapter 1060 is connected to, for example, a display 1130.
- The hard disk drive 1090 stores, for example, an OS 1091, application programs 1092, program modules 1093, and program data 1094. That is, a learning program that defines each process of the authenticity determination server 10 is implemented as a program module 1093 in which computer-executable code is described.
- The program modules 1093 are stored, for example, on the hard disk drive 1090.
- The hard disk drive 1090 stores a program module 1093 for executing processing similar to the functional configuration of the authenticity determination server 10.
- The hard disk drive 1090 may be replaced by an SSD (Solid State Drive).
- The setting data used in the processing of the above-described embodiment is stored as program data 1094 in, for example, the memory 1010 or the hard disk drive 1090. Then, the CPU 1020 reads the program modules 1093 and program data 1094 stored in the memory 1010 and the hard disk drive 1090 into the RAM 1012 as necessary and executes the processes of the above-described embodiment.
- The program modules 1093 and program data 1094 are not limited to being stored in the hard disk drive 1090; they may be stored in, for example, a removable storage medium and read by the CPU 1020 via the disk drive 1100 or the like. Alternatively, the program modules 1093 and program data 1094 may be stored in another computer connected via a network such as a LAN (Local Area Network) or a WAN (Wide Area Network). The program modules 1093 and program data 1094 may then be read by the CPU 1020 from the other computer through the network interface 1070.
Landscapes
- Engineering & Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- General Engineering & Computer Science (AREA)
- Computer Hardware Design (AREA)
- General Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Bioethics (AREA)
- Virology (AREA)
- Storage Device Security (AREA)
- Debugging And Monitoring (AREA)
- Stored Programmes (AREA)
- Computer And Data Communications (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
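[Authenticity Determination System]
FIG. 1 is a block diagram of the authenticity determination system according to the embodiment. As shown in FIG. 1, the authenticity determination system 1 according to the present embodiment includes an authenticity determination server 10, which is an information processing device, and a monitoring target device 20 and a learning data providing device 30, which are devices external to the authenticity determination server 10. The authenticity determination server 10, the monitoring target device 20, and the learning data providing device 30 are connected via a network.
The authenticity determination server 10 will now be described. As shown in FIG. 1, the authenticity determination server 10 includes a verification unit 101, a verification result registration unit 102, an authenticity determination command unit 103, a normal state hash value storage unit 104, a hash value generation unit 105, a file information acquisition unit 106, an extraction unit 107, a training data creation unit 108, and a learning unit 109.
FIG. 4 is a flowchart of processing in the learning phase of the authenticity determination server according to the embodiment. Next, the flow of processing in the learning phase of the authenticity determination server 10 according to the present embodiment will be described with reference to FIG. 4.
FIG. 5 is a flowchart of processing in the classification phase and the normal state hash value storage phase of the authenticity determination server according to the embodiment. Next, the flow of processing in the classification phase and the normal state hash value storage phase of the authenticity determination server 10 according to the present embodiment will be described with reference to FIG. 5.
FIG. 6 is a flowchart of processing in the authenticity determination phase of the authenticity determination server according to the embodiment. Next, the flow of processing in the authenticity determination phase of the authenticity determination server 10 according to the present embodiment will be described with reference to FIG. 6.
As described above, the authenticity determination server 10 according to the present embodiment uses a learning model generated by learning the features of static files and dynamic files to extract, from the file group 22 held by the monitoring target device 20, the determination target files, which are static files. The authenticity determination server 10 according to the present embodiment then compares the normal state hash value obtained from each determination target file at a reference point in time, when it can be assumed that no tampering has occurred, with the falsification possibility existence hash value obtained from the determination target file held by the monitoring target device 20 in operation, and thereby detects tampering with the monitoring target device 20.
Each component of each illustrated device is functionally conceptual and does not necessarily need to be physically configured as illustrated. That is, the specific form of distribution and integration of each device is not limited to the illustrated one, and all or part of the devices can be functionally or physically distributed or integrated in arbitrary units according to various loads, usage conditions, and the like. In particular, the monitoring target device 20 can be configured to include the normal state hash value storage unit 104 and the verification unit 101. When the verification unit 101 is arranged in the monitoring target device 20, the monitoring target device 20 transmits the verification result to the authenticity determination server 10, and the authenticity determination server 10 registers the acquired verification result. Furthermore, all or any part of each processing function performed by each device can be realized by a CPU (Central Processing Unit) and a program analyzed and executed by the CPU, or can be realized as hardware using wired logic.
As one embodiment, the authenticity determination server 10 can be implemented by installing, on a desired computer, an information processing program that executes the above information processing as package software or online software. For example, an information processing device can be made to function as the authenticity determination server 10 by causing it to execute the above information processing program. The information processing device referred to here may include a desktop or notebook personal computer in addition to a server computer. The category of information processing devices also includes mobile communication terminals such as smartphones, mobile phones, and PHS (Personal Handy-phone System) terminals, as well as slate terminals such as PDAs (Personal Digital Assistants).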
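10 Authenticity determination server
20 Monitoring target device
21 Hash value generation unit
22 File group
30 Learning data providing device
101 Verification unit
102 Verification result registration unit
103 Authenticity determination command unit
104 Normal state hash value storage unit
105 Hash value generation unit
106 File information acquisition unit
107 Extraction unit
108 Training data creation unit
109 Learning unit
110 Determination target classification learning model
120 Determination target file group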
Claims (7)
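- An information processing device comprising: a learning unit that learns the respective features of dynamic files and static files to generate a learning model; and an extraction unit that acquires, from an external device that uses a predetermined file group, the predetermined file group at a reference point in time, and extracts, from the predetermined file group and based on the learning model, determination target files that are the static files.
- The information processing device according to claim 1, wherein the learning unit learns the respective features of the dynamic files and the static files using binary data of the static files and the dynamic files.
- The information processing device according to claim 1 or 2, wherein the learning unit performs the learning using training data for each Operating System (OS) domain type.
- The information processing device according to any one of claims 1 to 3, wherein the extraction unit receives input of the predetermined file group, classifies the input predetermined file group into the static files or the dynamic files based on the learning model, and extracts the static files.
- The information processing device according to any one of claims 1 to 4, further comprising: a hash value generation unit that obtains a first hash value of each of the determination target files extracted by the extraction unit; and a verification unit that acquires, from the external device, a second hash value of each of the determination target files held by the external device after time has elapsed from the reference point in time, compares the first hash value with the second hash value, and verifies whether the predetermined file group has been tampered with.
- An information processing method comprising: learning the respective features of dynamic files and static files to generate a learning model; acquiring, from an external device that uses a predetermined file group, the predetermined file group at a reference point in time; and extracting, from the predetermined file group and based on the learning model, determination target files that are the static files.
- An information processing program that causes a computer to execute processing comprising: learning the respective features of dynamic files and static files to generate a learning model; acquiring, from an external device that uses a predetermined file group, the predetermined file group at a reference point in time; and extracting, from the predetermined file group and based on the learning model, determination target files that are the static files.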
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP21925673.2A EP4276670A1 (en) | 2021-02-12 | 2021-02-12 | Information processing device, information processing method, and information processing program |
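CN202180093035.2A CN116830106A (zh) | 2021-02-12 | 2021-02-12 | Information processing device, information processing method, and information processing program |
PCT/JP2021/005370 WO2022172422A1 (ja) | 2021-02-12 | 2021-02-12 | Information processing device, information processing method, and information processing program |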
AU2021427822A AU2021427822B2 (en) | 2021-02-12 | 2021-02-12 | Information processing device, information processing method, and information processing program |
JP2022581133A JPWO2022172422A1 (ja) | 2021-02-12 | 2021-02-12 |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2021/005370 WO2022172422A1 (ja) | 2021-02-12 | 2021-02-12 | Information processing device, information processing method, and information processing program |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022172422A1 (ja) | 2022-08-18 |
Family
ID=82838490
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2021/005370 WO2022172422A1 (ja) | Information processing device, information processing method, and information processing program | 2021-02-12 | 2021-02-12 |
Country Status (5)
Country | Link |
---|---|
EP (1) | EP4276670A1 (ja) |
JP (1) | JPWO2022172422A1 (ja) |
CN (1) | CN116830106A (ja) |
AU (1) | AU2021427822B2 (ja) |
WO (1) | WO2022172422A1 (ja) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180203978A1 (en) * | 2017-01-13 | 2018-07-19 | Microsoft Technology Licensing, Llc | Machine-learning models for predicting decompensation risk |
JP2019008377A (ja) | 2017-06-20 | 2019-01-17 | 日本電信電話株式会社 | Collation information generation device, management system, and collation information generation method |
US20190034632A1 (en) * | 2017-07-25 | 2019-01-31 | Trend Micro Incorporated | Method and system for static behavior-predictive malware detection |
JP2020071560A (ja) | 2018-10-30 | 2020-05-07 | 日本電信電話株式会社 | Management system, acquisition device, and management method |
JP2020523707A (ja) * | 2017-06-16 | 2020-08-06 | ホアウェイ・テクノロジーズ・カンパニー・リミテッド | User profile generation method and terminal |
-
2021
- 2021-02-12 AU AU2021427822A patent/AU2021427822B2/en active Active
- 2021-02-12 WO PCT/JP2021/005370 patent/WO2022172422A1/ja active Application Filing
- 2021-02-12 CN CN202180093035.2A patent/CN116830106A/zh active Pending
- 2021-02-12 EP EP21925673.2A patent/EP4276670A1/en active Pending
- 2021-02-12 JP JP2022581133A patent/JPWO2022172422A1/ja active Pending
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180203978A1 (en) * | 2017-01-13 | 2018-07-19 | Microsoft Technology Licensing, Llc | Machine-learning models for predicting decompensation risk |
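JP2020523707A (ja) * | 2017-06-16 | 2020-08-06 | ホアウェイ・テクノロジーズ・カンパニー・リミテッド | User profile generation method and terminal |
JP2019008377A (ja) | 2017-06-20 | 2019-01-17 | 日本電信電話株式会社 | Collation information generation device, management system, and collation information generation method |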
US20190034632A1 (en) * | 2017-07-25 | 2019-01-31 | Trend Micro Incorporated | Method and system for static behavior-predictive malware detection |
JP2020071560A (ja) | 2018-10-30 | 2020-05-07 | 日本電信電話株式会社 | Management system, acquisition device, and management method |
Also Published As
Publication number | Publication date |
---|---|
EP4276670A1 (en) | 2023-11-15 |
JPWO2022172422A1 (ja) | 2022-08-18 |
AU2021427822B2 (en) | 2024-05-09 |
CN116830106A (zh) | 2023-09-29 |
AU2021427822A1 (en) | 2023-08-03 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Zhan et al. | Atvhunter: Reliable version detection of third-party libraries for vulnerability identification in android applications | |
CN102521081B (zh) | Repairing corrupted software | |
US8627469B1 (en) | Systems and methods for using acquisitional contexts to prevent false-positive malware classifications | |
JP2021518705A (ja) | Runtime self-correction for blockchain ledgers | |
Sejfia et al. | Practical automated detection of malicious npm packages | |
US20120311709A1 (en) | Automatic management system for group and mutant information of malicious codes | |
US20130111018A1 (en) | Passive monitoring of virtual systems using agent-less, offline indexing | |
Huang et al. | Android malware development on public malware scanning platforms: A large-scale data-driven study | |
Fu et al. | Data correlation‐based analysis methods for automatic memory forensic | |
CN112988607B (zh) | Component detection method, device, and storage medium for an application | |
WO2022225686A1 (en) | Automated contextual understanding of unstructured security documents | |
Rowe | Identifying forensically uninteresting files using a large corpus | |
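CN113590181A (zh) | Configuration file verification method, apparatus, device, and storage medium | |
CN112579330B (zh) | Method, apparatus, and device for processing operating system abnormal data | |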
US11151250B1 (en) | Evaluation of files for cybersecurity threats using global and local file information | |
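CN110008108B (zh) | Regression scope determination method, apparatus, device, and computer-readable storage medium | |
CN104657504A (zh) | A method for rapid file identification | |
WO2022172422A1 (ja) | Information processing device, information processing method, and information processing program | |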
US20240037243A1 (en) | Artificial intelligence based security requirements identification and testing | |
EP4386597A1 (en) | Cyber threat information processing device, cyber threat information processing method, and storage medium storing cyber threat information processing program | |
EP3799367B1 (en) | Generation device, generation method, and generation program | |
Imtiaz et al. | Predicting vulnerability for requirements | |
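JP5679347B2 (ja) | Failure detection device, failure detection method, and program | |
CN114175034A (zh) | Verification information generation system, verification information generation method, and verification information generation program | |
KR20120031963A (ko) | Malicious code blocking device |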
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 21925673; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 2022581133; Country of ref document: JP; Kind code of ref document: A |
| | ENP | Entry into the national phase | Ref document number: 2021427822; Country of ref document: AU; Date of ref document: 20210212; Kind code of ref document: A |
| | WWE | Wipo information: entry into national phase | Ref document number: 202180093035.2; Country of ref document: CN |
| | ENP | Entry into the national phase | Ref document number: 2021925673; Country of ref document: EP; Effective date: 20230807 |
| | NENP | Non-entry into the national phase | Ref country code: DE |