CN107566349B - Method and computing device for detecting sensitive file leakage in network server

Info

Publication number
CN107566349B
CN107566349B CN201710693457.9A CN201710693457A CN107566349B CN 107566349 B CN107566349 B CN 107566349B CN 201710693457 A CN201710693457 A CN 201710693457A CN 107566349 B CN107566349 B CN 107566349B
Authority
CN
China
Prior art keywords
file
sensitive
network server
file name
list
Prior art date
Legal status
Active
Application number
CN201710693457.9A
Other languages
Chinese (zh)
Other versions
CN107566349A (en)
Inventor
谢小强
Current Assignee
Beijing Knownsec Information Technology Co Ltd
Original Assignee
Beijing Knownsec Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Knownsec Information Technology Co Ltd
Priority to CN201710693457.9A
Publication of CN107566349A
Application granted
Publication of CN107566349B
Legal status: Active
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method for detecting sensitive file leakage in a network server, comprising the following steps: performing application identification on the network server to acquire application characteristics of the network server; creating a first list of compressed class sensitive files and a second list of uncompressed class sensitive files; according to the acquired application characteristics, generating file names of compressed class sensitive files corresponding to the application characteristics and adding them to the first list, and generating file names and regular expressions of uncompressed class sensitive files corresponding to the application characteristics and adding each file name to the second list in association with its regular expression; and sending a request to the network server for each file name in the first list and the second list, one by one, and determining from the response result of the network server whether the network server leaks sensitive files. The invention also discloses a computing device for executing the method.

Description

Method and computing device for detecting sensitive file leakage in network server
Technical Field
The invention relates to the field of the Internet, and in particular to a method and computing device for detecting sensitive file leakage in a network server.
Background
Sensitive files of a web server (website) are files such as backup files, test files, log files, and configuration files of the website that contain sensitive information about the website. Once an attacker obtains these sensitive files, the attacker can use them to launch further attacks on the website, causing immeasurable losses to individual or enterprise users. To keep the website running safely, it is necessary to detect whether the website leaks sensitive files.
In the prior art, common, generic backup files, configuration files, log files, and the like, such as www.rar, config.php.bak, and log.txt, are collected and combined into a sensitive file list. Requests are then sent to the website one by one for the sensitive files in the list, and whether the website leaks sensitive files is judged from the content of the responses. Checking the whole sensitive file list requires sending a large number of requests to the website, which wastes network resources. In fact, many of these requests are unnecessary. For example, a site built with PHP (Hypertext Preprocessor) will not contain test files, debugging files, installation files, or similar sensitive files belonging to other programming languages, so those requests need not be sent. In addition, the sensitive file list is fixed and identical for all websites, which easily leads to false negatives and false positives.
Therefore, a scheme for detecting whether a sensitive file in a network server is leaked or not, which can save network resources and reduce false positives, is needed.
Disclosure of Invention
To this end, the present invention provides a method and computing device for detecting sensitive file leakage in a network server, in an attempt to solve or at least alleviate at least one of the problems identified above.
According to one aspect of the invention, a method for detecting sensitive file leakage in a network server is provided, which is executed in a computing device and comprises the following steps: performing application identification on the network server to acquire application characteristics of the network server; creating a first list of compressed class sensitive files and a second list of uncompressed class sensitive files, wherein each data entry of the first list is a file name of a compressed class sensitive file, and each data entry of the second list is an association between a file name of an uncompressed class sensitive file and a regular expression, the regular expression being able to match the sensitive information corresponding to the sensitive file with that file name; according to the acquired application characteristics of the network server, generating a file name of a compressed class sensitive file corresponding to the application characteristics and adding the file name to the first list, and/or generating a file name and a regular expression of an uncompressed class sensitive file corresponding to the application characteristics and adding the file name to the second list in association with the regular expression; and sending a request to the network server for each file name in the first list and the second list, one by one, and determining from the response result of the network server whether the network server leaks sensitive files.
Optionally, in the method for detecting sensitive file leakage in a network server according to the present invention, the application characteristics of the network server include one or more of the following characteristics: domain name, programming language used, CMS type, and server type.
Optionally, in the method for detecting sensitive file leakage in a network server according to the present invention, the first list initially includes whole-site backup file names, database backup file names, and log backup file names that are unrelated to the application characteristics of the network server; the second list initially includes database creation file names, debugging file names, log file names, and test file names that are unrelated to the application characteristics of the network server, together with the regular expression associated with each file name.
Optionally, in the method for detecting leakage of a sensitive file in a network server according to the present invention, the sensitive information corresponding to the sensitive file includes content included in the sensitive file and content output by the network server executing the sensitive file.
Optionally, in the method for detecting sensitive file leakage in a network server according to the present invention, the application characteristics include a domain name of the network server, and the step of generating a file name of a compressed class sensitive file corresponding to the application characteristics includes: generating derivative words that include the domain name, and splicing each derivative word with common compressed file suffixes to generate a plurality of file names.
Optionally, in the method for detecting sensitive file leakage in a network server according to the present invention, the application characteristics include a programming language used by the network server, and the step of generating a file name and a regular expression of an uncompressed class sensitive file corresponding to the application characteristics includes: generating installation file names, test file names, and debugging file names corresponding to the programming language, and generating the associated regular expressions according to the sensitive information output when the network server executes the sensitive files with those file names.
Optionally, in the method for detecting sensitive file leakage in a network server according to the present invention, the application characteristics include a CMS type of the network server, and the step of generating a file name and a regular expression of an uncompressed class sensitive file corresponding to the application characteristics includes: generating a configuration file name corresponding to the CMS type, combining the configuration file name with common backup suffixes to generate a plurality of file names, and generating the associated regular expressions according to the sensitive information contained in the sensitive files with those file names.
Optionally, in the method for detecting sensitive file leakage in a network server according to the present invention, the application characteristics include a server type of the network server, and the step of generating a file name and a regular expression of an uncompressed class sensitive file corresponding to the application characteristics includes: generating a configuration file name corresponding to the server type, combining the configuration file name with common backup suffixes to generate a plurality of file names, and generating the associated regular expressions according to the sensitive information contained in the sensitive files with those file names.
Optionally, in the method for detecting sensitive file leakage in a network server according to the present invention, the step of determining whether the network server leaks sensitive files according to a response result of the network server includes: for a response result corresponding to a file name in the first list, determining whether the value of Content-Type in the response header is in a list of Content-Type values commonly used for compressed files, and if so, determining that the network server leaks the sensitive file with that file name.
Optionally, in the method for detecting sensitive file leakage in a network server according to the present invention, the step of determining whether the network server leaks sensitive files according to a response result of the network server includes: for a response result corresponding to a file name in the second list, determining whether the status code in the response header is a status code indicating that the network server successfully processed the request; if so, further matching the response content against the regular expression associated with the file name; and if the match succeeds, determining that the network server leaks the sensitive file with that file name.
Optionally, in the method for detecting sensitive file leakage in a network server according to the present invention, a web application identification tool is used to perform application identification on the network server, where the web application identification tool is WhatWeb, BlindElephant, or Wappalyzer.
According to yet another aspect of the present invention, there is provided a computing device comprising: one or more processors; and a memory; one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs including instructions for performing any of the methods described above.
According to a further aspect of the invention there is provided a computer readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by a computing device, cause the computing device to perform any of the methods described above.
According to the scheme for detecting sensitive file leakage in a network server of the present invention, basic information about the website is collected, a list of sensitive files that may exist is generated from the collected information using specific rules, and the website is enumerated against that list to detect whether the target website leaks sensitive files. This narrows the range of sensitive files to be enumerated to a certain extent, reduces false negatives and false positives, and improves detection efficiency.
Drawings
To the accomplishment of the foregoing and related ends, certain illustrative aspects are described herein in connection with the following description and the annexed drawings, which are indicative of various ways in which the principles disclosed herein may be practiced, and all aspects and equivalents thereof are intended to be within the scope of the claimed subject matter. The above and other objects, features and advantages of the present disclosure will become more apparent from the following detailed description read in conjunction with the accompanying drawings. Throughout this disclosure, like reference numerals generally refer to like parts or elements.
FIG. 1 shows a schematic diagram of a configuration of a computing device 100 according to one embodiment of the invention;
FIG. 2 illustrates a flow diagram of a method 200 for detecting sensitive file leaks in a network server, according to one embodiment of the present invention;
FIG. 3 illustrates a screen shot of a request and response in the presence of a compressed class sensitive file in a web server;
FIG. 4 illustrates a screen shot of a request and response in the absence of a compressed class sensitive file in a web server;
FIG. 5 illustrates a screen shot of a request and response in the presence of an uncompressed class sensitive file in a web server;
FIG. 6 illustrates a screen shot of a request and response in the absence of an uncompressed class sensitive file in a web server.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
Fig. 1 is a block diagram of an example computing device 100. In a basic configuration 102, computing device 100 typically includes system memory 106 and one or more processors 104. A memory bus 108 may be used for communication between the processor 104 and the system memory 106.
Depending on the desired configuration, the processor 104 may be any type of processor, including but not limited to: a microprocessor (μP), a microcontroller (μC), a digital signal processor (DSP), or any combination thereof. The processor 104 may include one or more levels of cache, such as a level one cache 110 and a level two cache 112, a processor core 114, and registers 116. The example processor core 114 may include an Arithmetic Logic Unit (ALU), a Floating Point Unit (FPU), a digital signal processing core (DSP core), or any combination thereof. The example memory controller 118 may be used with the processor 104, or in some implementations the memory controller 118 may be an internal part of the processor 104.
Depending on the desired configuration, system memory 106 may be any type of memory, including but not limited to: volatile memory (such as RAM), non-volatile memory (such as ROM, flash memory, etc.), or any combination thereof. System memory 106 may include an operating system 120, one or more applications 122, and program data 124. In some embodiments, application 122 may be arranged to operate with program data 124 on an operating system. In some embodiments, computing device 100 is configured to perform method 200 of detecting sensitive file leaks in a network server. Program data 124 includes instructions for performing the method 200.
Computing device 100 may also include an interface bus 140 that facilitates communication from various interface devices (e.g., output devices 142, peripheral interfaces 144, and communication devices 146) to the basic configuration 102 via the bus/interface controller 130. The example output device 142 includes a graphics processing unit 148 and an audio processing unit 150. They may be configured to facilitate communication with various external devices, such as a display or speakers, via one or more A/V ports 152. Example peripheral interfaces 144 may include a serial interface controller 154 and a parallel interface controller 156, which may be configured to facilitate communication with external devices such as input devices (e.g., keyboard, mouse, pen, voice input device, image input device) or other peripherals (e.g., printer, scanner, etc.) via one or more I/O ports 158. An example communication device 146 may include a network controller 160, which may be arranged to facilitate communications with one or more other computing devices 162 over a network communication link via one or more communication ports 164.
A network communication link may be one example of a communication medium. Communication media may typically be embodied by computer readable instructions, data structures, or program modules in a modulated data signal, such as a carrier wave or other transport mechanism, and may include any information delivery media. A "modulated data signal" may be a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of non-limiting example, communication media may include wired media such as a wired network or a dedicated wired network, and various wireless media such as acoustic, radio frequency (RF), microwave, infrared (IR), or other wireless media. The term computer readable media as used herein may include both storage media and communication media. In some embodiments, one or more programs are stored in a computer-readable medium, the one or more programs including instructions for performing certain methods, such as the method 200 for detecting sensitive file leaks in a network server, performed by the computing device 100 according to embodiments of the present invention.
Computing device 100 may be implemented as part of a small-form-factor portable (or mobile) electronic device such as a cellular telephone, a Personal Digital Assistant (PDA), a personal media player device, a wireless web-watch device, a personal headset device, an application-specific device, or a hybrid device that includes any of the above functions. Computing device 100 may also be implemented as a personal computer including both desktop and notebook computer configurations.
FIG. 2 shows a flow diagram of a method 200 for detecting sensitive file leaks in a network server, according to one embodiment of the invention.
As shown in FIG. 2, the method begins at step S210. In step S210, application identification is performed on the web server (website) with a web application identification tool to acquire the application characteristics of the web server. The application characteristics of the web server are basic information such as the domain name of the website, the programming language that may be used, the CMS (Content Management System) type, and the server type.
In this embodiment of the present invention, a common web application identification tool, such as WhatWeb, BlindElephant, or Wappalyzer, may be used to identify the applications of the target website. Assuming the website URL is http://www.toyean.com/, application identification is performed with WhatWeb: the URL is passed to the WhatWeb tool, which sends a request to the target website and returns the following result:
root@localhost:~/WhatWeb# ./whatweb http://www.toyean.com/
http://www.toyean.com/ [200 OK] Apache, Country[CHINA][CN], HTTPServer[Apache], IP[123.57.234.80], JQuery[1.8.3], MetaGenerator[Z-BlogPHP 1.5.1 Zero], PHP[5.2.17], PoweredBy[ZBill], Script[JavaScript,text/javascript], Title [ TM ] Title [ TM ] Website-professional ZBlog theme original website ], UncommonHeaders[product], X-Powered-By[PHP/5.2.17], X-UA-Compatible[ie=edge]
As can be seen from the above results, the programming language used by the website is PHP, CMS type is Z-Blog, and server type is Apache.
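The step of reading application characteristics out of such a result can be sketched in Python. This is only an illustrative sketch: the sample line is a shortened copy of the output above, and the function names `parse_whatweb_line` and `extract_features`, as well as the choice of which plugin maps to which characteristic, are assumptions rather than part of the described method.

```python
import re

# Shortened sample modeled on the WhatWeb output quoted above.
SAMPLE = (
    "http://www.toyean.com/ [200 OK] Apache, Country[CHINA][CN], "
    "HTTPServer[Apache], MetaGenerator[Z-BlogPHP 1.5.1 Zero], "
    "PHP[5.2.17], X-Powered-By[PHP/5.2.17]"
)

def parse_whatweb_line(line):
    """Split one result line into the URL and a dict of plugin name -> values."""
    url, _, rest = line.partition(" ")
    plugins = {}
    # A plugin with values looks like Name[value][value]; bare names are skipped.
    for name, values in re.findall(r"([A-Za-z][\w-]*)((?:\[[^\]]*\])+)", rest):
        plugins[name] = re.findall(r"\[([^\]]*)\]", values)
    return url, plugins

def extract_features(url, plugins):
    """Map plugin results to the four characteristics named in the text (assumed mapping)."""
    host = re.sub(r"^https?://", "", url).strip("/")
    return {
        "domain": host,
        "language": "PHP" if "PHP" in plugins else None,
        "cms": plugins.get("MetaGenerator", [None])[0],
        "server": plugins.get("HTTPServer", [None])[0],
    }

url, plugins = parse_whatweb_line(SAMPLE)
features = extract_features(url, plugins)
```

In practice a structured output format from the identification tool would be easier to consume than a text line; the regular-expression parsing here is only meant to show how the characteristics could be derived from the result shown.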
In step S220, two basic lists of sensitive files are created, one is a list of compressed class sensitive files (referred to herein as a first list) and the other is a list of uncompressed class sensitive files (referred to herein as a second list).
The first list includes a plurality of data entries, each data entry being the file name of a compressed class sensitive file. The first list initially includes file names of compressed class sensitive files that are unrelated to the application characteristics of the network server, such as whole-site backup file names, database backup file names, and log backup file names. These file names may be collected and stored in the computing device as needed by those skilled in the art. One example of a compressed class sensitive file list is as follows:
The second list also includes a plurality of data entries, each data entry being an association between the file name of an uncompressed class sensitive file and a regular expression. That is, the data entries stored in the second list have the format "file name: regular expression". The regular expression can match the sensitive information corresponding to the sensitive file with that file name, where the sensitive information corresponding to a sensitive file includes the content contained in the sensitive file and the content output when the network server executes the sensitive file. A regular expression describes a pattern for matching character strings; it can be used to check whether a string contains a certain substring, to replace a matching substring, or to extract from a string the substrings that meet a certain condition.
The second list initially includes file names of uncompressed class sensitive files that are unrelated to the application characteristics of the network server, together with the regular expression associated with each file name. The file names of such uncompressed class sensitive files, for example database creation file names, debugging file names, log file names, and test file names, can be collected and stored in the computing device in advance as needed by those skilled in the art. The sensitive information corresponding to each uncompressed class sensitive file can likewise be collected in advance, and a regular expression for testing whether that sensitive information appears in a given text can be generated from it. One example of an uncompressed class sensitive file list is as follows:
It should be noted that, in the embodiment of the present invention, the execution order of step S210 and step S220 is not limited: step S210 may be executed first and then step S220, or step S220 may be executed first and then step S210.
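One possible in-memory representation of the two lists from step S220 is sketched below in Python. The names `www.rar` and `log.txt` appear earlier in this description; every other file name, every pattern, and the helper names `add_compressed` and `add_uncompressed` are hypothetical and serve only to illustrate the "file name: regular expression" association.

```python
import re

# First list: file names of compressed class sensitive files.
# "www.rar" is from the text; the other entries are illustrative assumptions.
first_list = ["www.rar", "backup.zip", "database.tar.gz"]

# Second list: file name -> compiled regular expression that should match
# the sensitive information in the response. Patterns are assumptions.
second_list = {
    "log.txt": re.compile(r"\d{4}-\d{2}-\d{2}"),
    "test.php": re.compile(r"PHP Version"),
}

def add_compressed(name):
    """Append a file name to the first list, skipping duplicates."""
    if name not in first_list:
        first_list.append(name)

def add_uncompressed(name, pattern):
    """Store a file name in the second list together with its regular expression."""
    second_list[name] = re.compile(pattern)

add_compressed("www.rar")  # already present, so the list is unchanged
add_uncompressed("wp-config.php.bak", r"define\(\s*'WP_DEBUG'")
```

A dictionary keeps each file name bound to exactly one pattern, which matches the one-to-one association the description requires for the second list.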
In step S230, the first list and the second list are updated according to the acquired application characteristics of the web server. This step includes: generating file names of compressed class sensitive files corresponding to the application characteristics and adding them to the first list; and generating file names and regular expressions of uncompressed class sensitive files corresponding to the application characteristics and adding each file name to the second list in association with its regular expression.
As previously mentioned, the application characteristics of the web server may include the domain name, programming language that may be used, CMS type, server type, etc. basic information. The updating of the first list and the second list using these application features will be described below. It should be noted that the embodiments of the present invention are not limited to these several application features, and for other application features, a person skilled in the art may update the first list or the second list according to a similar principle.
(1) If the application features are domain names, a plurality of derivative words including the domain names can be generated according to the domain names, each derivative word is spliced with a common compressed file suffix, and a plurality of file names are generated and added to the first list. In the embodiment of the present invention, the derivative of the domain name may be collected and stored in the computing device by those skilled in the art in advance, and the domain name derivative includes the domain name itself.
For example, for a domain name such as example.com, derivative words such as example_database and example_backup are generated from the domain name and then spliced with common compressed file suffixes (.rar, .zip, .tar.gz, etc.) to generate a list of sensitive files, as follows:
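A minimal Python sketch of this splicing step follows. The derivative templates mirror the two examples named in the text (`_database`, `_backup`); the template list itself, the naive way of taking the domain's core label, and the function names are assumptions.

```python
from itertools import product

# Hypothetical derivative templates; the two suffixed forms mirror the text.
TEMPLATES = ["{name}", "{name}_database", "{name}_backup"]
# Common compressed file suffixes listed in the text.
SUFFIXES = [".rar", ".zip", ".tar.gz"]

def domain_core(domain):
    """Naively take the second-to-last label, e.g. 'example' for 'www.example.com'.

    This does not handle multi-part public suffixes such as 'co.uk'.
    """
    labels = domain.lower().strip(".").split(".")
    return labels[-2] if len(labels) >= 2 else labels[0]

def compressed_names_for_domain(domain):
    """Return candidate file names: every derivative word joined with every suffix."""
    core = domain_core(domain)
    # The derivative words include the domain name itself, as the text states.
    words = [domain] + [t.format(name=core) for t in TEMPLATES]
    words = list(dict.fromkeys(words))  # drop duplicates, keep order
    return [word + suffix for word, suffix in product(words, SUFFIXES)]

names = compressed_names_for_domain("example.com")
```

Each generated name would then be appended to the first list.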
(2) If the application characteristic is a programming language, installation file names, test file names, and debugging file names corresponding to the programming language are generated, and the associated regular expressions are generated according to the sensitive information output when the network server executes the sensitive files with those file names.
Assuming that the website programming language is PHP, a test file is a simple script written by an administrator or developer to check whether the PHP environment runs normally after the website is built. Commonly used test file names are phpinfo.php, test.php, t.php, and so on. The content output when the website executes these test files is usually PHP environment information or a simple string, so the output can be matched with a regular expression. The generated associations between file names and regular expressions are as follows:
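As a hedged illustration in Python, the associations for PHP might be built as below. The three file names come from the text; the patterns are assumptions based on the "PHP Version" heading that a phpinfo() page prints, and `LANGUAGE_PROBES` and `entries_for_language` are hypothetical names.

```python
import re

# Per-language probes: file name -> pattern expected in the output when the
# server executes the file. File names are from the text; patterns are assumed.
LANGUAGE_PROBES = {
    "PHP": {
        "phpinfo.php": r"PHP Version",
        "test.php": r"PHP Version|phpinfo\(\)",
        "t.php": r"PHP Version|phpinfo\(\)",
    },
}

def entries_for_language(language):
    """Return second-list entries (file name -> compiled regex) for a language."""
    probes = LANGUAGE_PROBES.get(language.upper(), {})
    return {name: re.compile(pattern) for name, pattern in probes.items()}

php_entries = entries_for_language("php")
```

An unknown language simply yields no entries, so no requests are spent on files that cannot exist on the target.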
(3) If the application characteristic is a CMS type, a configuration file name corresponding to the CMS type is generated and combined with common backup suffixes to generate a plurality of file names, and the associated regular expressions are generated according to the sensitive information contained in the sensitive files with those file names. A CMS is a software system used to manage, create, and update the content of a website.
Assuming that the CMS used by the website is WordPress, downloading WordPress shows that its configuration file is wp-config.php, whose content contains the string "define('WP_DEBUG',", which can serve as the matching basis for the regular expression.
(4) If the application characteristics are the server type, generating a configuration file name corresponding to the server type, combining the configuration file name with a common backup suffix to generate a plurality of file names, and generating a related regular expression according to sensitive information included in sensitive files with the file names.
Assuming that the server type used by the website is Tomcat, downloading Tomcat shows that its configuration files are /WEB-INF/web.xml and /WEB-INF/server.xml, and web.xml contains the string "<web-app", which can serve as the matching basis for the regular expression. The configuration file names are spliced with common backup suffixes (.bak, .swp, .1, %20(copy), etc.), the regular expressions are written to match these strings, and the generated associations between file names and regular expressions are as follows:
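Cases (3) and (4) follow the same pattern, which can be sketched in Python as below. The configuration file names, marker strings, and backup suffixes are those given in the text; the whitespace tolerance in the WordPress pattern and the names `CONFIG_PATTERNS` and `backup_entries` are assumptions.

```python
import re

# Common backup suffixes listed in the text.
BACKUP_SUFFIXES = [".bak", ".swp", ".1", "%20(copy)"]

# CMS or server type -> {configuration file name: pattern for its content}.
# Markers are from the text; allowing whitespace after "(" is an assumption.
CONFIG_PATTERNS = {
    "WordPress": {"wp-config.php": r"define\(\s*'WP_DEBUG'"},
    "Tomcat": {"/WEB-INF/web.xml": r"<web-app"},
}

def backup_entries(kind):
    """Return second-list entries for every config file + backup suffix pair."""
    entries = {}
    for config_name, pattern in CONFIG_PATTERNS.get(kind, {}).items():
        compiled = re.compile(pattern)
        for suffix in BACKUP_SUFFIXES:
            entries[config_name + suffix] = compiled
    return entries

wordpress_entries = backup_entries("WordPress")
tomcat_entries = backup_entries("Tomcat")
```

All backup variants of one configuration file share the same regular expression, since they are expected to contain the same content.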
In step S240, for each file name in the first list and the second list, a request (e.g., an HTTP request) is sent to the web server, one by one, and whether the web server leaks sensitive files is determined from the response result (e.g., the HTTP response) of the web server.
For a response result corresponding to a file name in the first list, it is determined whether the value of Content-Type in the response header is in the list of Content-Type values commonly used for compressed files. If so, it is determined that the network server leaks the sensitive file with that file name; if not, it is determined that the network server does not leak the sensitive file with that file name. The Content-Type attribute specifies the HTTP content type of the request and response; it defines the type of the network file and the character encoding of the web page, and determines in what form and with what encoding the recipient reads the file. The Content-Type header list commonly used for compressed files is as follows:
In this embodiment of the present invention, sending the requests with the Burp Suite tool is taken as an example. Burp Suite is an integrated platform for testing web applications. It contains many tools and many interfaces designed for them, all sharing a powerful, extensible framework that handles HTTP messages, persistence, authentication, proxies, logging, and alerts.
As shown in FIG. 3, when the sensitive file is localhost.rar (a sensitive file name generated from the domain name), the Content-Type of the response header is application/x-rar-compressed, which is in the Content-Type header list commonly used for compressed files, indicating that localhost.rar is a leaked sensitive file of the website localhost.
As shown in FIG. 4, when the sensitive file is www.rar, the Content-Type of the response header is text/html, which is not in the Content-Type header list commonly used for compressed files, indicating that www.rar is not a sensitive file of the website localhost.
The inventor of the present application found that many existing websites use an application defense system (such as a WAF) to prevent intrusion, and that for certain types of HTTP requests a 200 OK status code is returned whether or not the corresponding resource exists on the website. Therefore, compared with the prior art, which determines whether a website leaks sensitive files from the HTTP response status code, judging by the Content-Type improves the accuracy of sensitive file leakage detection and reduces false positives and false negatives.
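The Content-Type test for first-list responses can be sketched in Python as follows. Only `application/x-rar-compressed` is confirmed by the text (FIG. 3); the other media types in the set, and the function name `is_compressed_leak`, are assumptions added for illustration.

```python
# Content-Type values treated as compressed-file responses.
# application/x-rar-compressed is from the text; the rest are assumed additions.
COMPRESSED_CONTENT_TYPES = {
    "application/x-rar-compressed",
    "application/zip",
    "application/gzip",
    "application/x-gzip",
    "application/x-tar",
    "application/x-7z-compressed",
}

def is_compressed_leak(headers):
    """Return True if the response Content-Type is in the compressed-file list.

    `headers` is a plain dict of response header names to values.
    """
    for key, value in headers.items():
        if key.lower() == "content-type":
            # Drop parameters such as "; charset=utf-8" before comparing.
            media_type = value.split(";", 1)[0].strip().lower()
            return media_type in COMPRESSED_CONTENT_TYPES
    return False

leak_fig3 = is_compressed_leak({"Content-Type": "application/x-rar-compressed"})
leak_fig4 = is_compressed_leak({"Content-Type": "text/html; charset=utf-8"})
```

The status code is deliberately ignored here, which reflects the reasoning above: a defense system may answer 200 OK with an HTML page for a file that does not exist.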
For the response result corresponding to the file name in the second list, judging whether the state code in the response header is a state code (for example, 20x, 30x) indicating that the network server successfully processes the request, if so, further matching the response content with the regular expression associated with the file name, and if matching is successful, determining that the network server leaks the sensitive file with the file name; if the status code in the response header indicates that the network server did not successfully process the request (e.g., 404), then it is determined that the network server does not have a leak for the sensitive file with the file name.
As shown in FIG. 5, when the sensitive file is wp-config.php.bak, the status code of the response is 200 and the regular expression "define('WP_DEBUG'" matches the response content, which indicates that wp-config.php.bak is a sensitive file of the website localhost.
As shown in FIG. 6, when the sensitive file is config.php.bak, the status code of the response is 404, which indicates that config.php.bak is not a sensitive file of the website localhost.
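The two-stage judgment for file names in the second list can be sketched as follows. The function name is_uncompressed_leak and the escaped pattern WP_DEBUG_PATTERN are hypothetical; the description gives only the unescaped text define('WP_DEBUG' as the expression, and "20x, 30x" is interpreted here literally as status codes 200 to 209 and 300 to 309.

```python
import re

# Hypothetical escaped form of the expression shown for wp-config.php.bak.
WP_DEBUG_PATTERN = r"define\(\s*'WP_DEBUG'"


def is_uncompressed_leak(status_code, body, pattern):
    """Judge a second-list response: status code first, then the regex."""
    # Stage 1: only 20x / 30x codes count as "request processed successfully".
    if status_code // 10 not in (20, 30):
        return False
    # Stage 2: the response content must match the associated expression,
    # which filters out 200 OK pages that do not contain the sensitive data.
    return re.search(pattern, body) is not None
```

For example, a 200 response whose body contains define('WP_DEBUG', false); is reported as a leak, while a 404 response, or a 200 response carrying an unrelated HTML page, is not.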
According to the scheme of the present invention for detecting sensitive file leakage in a network server, simple information collection is performed on the website, a list of sensitive files that may exist is generated from the collected information according to specific rules, and the website is enumerated against that list to detect whether the target website leaks sensitive files. This narrows the range of sensitive files to be enumerated to a certain extent and improves detection efficiency. Furthermore, compared with the prior art, which judges directly from the HTTP response status code, matching with regular expressions improves detection accuracy and reduces false negatives and false positives for sensitive files.
It should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be interpreted as reflecting an intention that the invention as claimed requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
Those skilled in the art will appreciate that the modules or units or components of the devices in the examples disclosed herein may be arranged in a device as described in this embodiment or alternatively may be located in one or more devices different from the devices in this example. The modules in the foregoing examples may be combined into one module or may be further divided into multiple sub-modules.
Those skilled in the art will appreciate that the modules in the device in an embodiment may be adaptively changed and disposed in one or more devices different from the embodiment. The modules or units or components of the embodiments may be combined into one module or unit or component, and furthermore they may be divided into a plurality of sub-modules or sub-units or sub-components. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where at least some of such features and/or processes or elements are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Furthermore, those skilled in the art will appreciate that while some embodiments described herein include some features included in other embodiments, rather than other features, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
The various techniques described herein may be implemented in connection with hardware or software or, alternatively, with a combination of both. Thus, the methods and apparatus of the present invention, or certain aspects or portions thereof, may take the form of program code (i.e., instructions) embodied in tangible media, such as floppy diskettes, CD-ROMs, hard drives, or any other machine-readable storage medium, wherein, when the program is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the invention.
In the case of program code execution on programmable computers, the computing device will generally include a processor, a storage medium readable by the processor (including volatile and non-volatile memory and/or storage elements), at least one input device, and at least one output device. Wherein the memory is configured to store program code; the processor is configured to perform the method of the present invention according to instructions in the program code stored in the memory.
By way of example, and not limitation, computer readable media may comprise computer storage media and communication media. Computer storage media store information such as computer readable instructions, data structures, program modules or other data. Communication media typically embody computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and include any information delivery media. Combinations of any of the above are also included within the scope of computer readable media.
Furthermore, some of the described embodiments are described herein as a method or combination of method elements that can be performed by a processor of a computer system or by other means of performing the described functions. A processor having the necessary instructions for carrying out the method or method elements thus forms a means for carrying out the method or method elements. Further, the elements of the apparatus embodiments described herein are examples of the following apparatus: the apparatus is used to implement the functions performed by the elements for the purpose of carrying out the invention.
As used herein, unless otherwise specified the use of the ordinal adjectives "first", "second", "third", etc., to describe a common object, merely indicate that different instances of like objects are being referred to, and are not intended to imply that the objects so described must be in a given sequence, either temporally, spatially, in ranking, or in any other manner.
While the invention has been described with respect to a limited number of embodiments, those skilled in the art, having benefit of this description, will appreciate that other embodiments can be devised which do not depart from the scope of the invention as described herein. Furthermore, it should be noted that the language used in the specification has been principally selected for readability and instructional purposes, and may not have been selected to delineate or circumscribe the inventive subject matter. Accordingly, many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the appended claims. The present invention has been disclosed in an illustrative rather than a restrictive sense, and the scope of the present invention is defined by the appended claims.

Claims (13)

1. A method for detecting sensitive file leakage in a network server, executed in a computing device, and comprising the steps of:
carrying out application identification on the network server to acquire application characteristics of the network server;
creating a first list of compressed class sensitive files and a second list of uncompressed class sensitive files, wherein each data entry of the first list is a file name of a compressed class sensitive file, and each data entry of the second list is an association between a file name of an uncompressed class sensitive file and a regular expression, wherein sensitive information corresponding to the sensitive file with the file name can be matched according to the regular expression;
generating a file name of a compressed sensitive file corresponding to the application characteristic according to the application characteristic of the acquired network server, adding the file name into the first list, generating a file name and a regular expression of a non-compressed sensitive file corresponding to the application characteristic, and adding the file name and the regular expression into the second list in a correlation manner; and
initiating requests to the network server one by one for each file name in the first list and the second list, and judging whether sensitive files are leaked from the network server according to a response result of the network server.
2. The method of claim 1, wherein the application characteristics of the web server include one or more of the following characteristics: domain name, programming language used, content management system type, and server type.
3. The method of claim 1, wherein the first list initially includes whole-site backup file names, database backup file names, and log backup file names that are independent of application characteristics of a network server; the second list initially includes database creation file names, debug file names, log file names, and test file names that are independent of application characteristics of the network server, as well as regular expressions associated with each file name.
4. The method of claim 1, wherein the sensitive information corresponding to the sensitive file comprises content included in the sensitive file and content output by a web server executing the sensitive file.
5. The method of claim 1, wherein the application characteristic comprises a domain name of a web server, and the step of generating a file name of a compressed class-sensitive file corresponding to the application characteristic comprises:
generating a derivative word comprising the domain name according to the domain name, and splicing the derivative word with a common compressed file suffix to generate a plurality of file names.
6. The method of claim 1, wherein the application characteristic comprises a programming language used by a web server, and the step of generating a file name and a regular expression of the uncompressed class-sensitive file corresponding to the application characteristic comprises:
generating an installation file name, a test file name and a debugging file name corresponding to the programming language, and generating an associated regular expression according to sensitive information output when a network server executes a sensitive file with the file name.
7. The method of claim 1, wherein the application characteristic comprises a content management system type of a network server, and the step of generating a file name and a regular expression of the uncompressed class-sensitive file corresponding to the application characteristic comprises:
generating a configuration file name corresponding to the type of the content management system, combining the configuration file name with a common backup suffix to generate a plurality of file names, and generating a related regular expression according to sensitive information included in sensitive files with the file names.
8. The method of claim 1, wherein the application characteristic comprises a server type of a network server, and the step of generating a file name and a regular expression of the uncompressed class-sensitive file corresponding to the application characteristic comprises:
generating a configuration file name corresponding to the type of the server, combining the configuration file name with a common backup suffix to generate a plurality of file names, and generating a related regular expression according to sensitive information included in sensitive files with the file names.
9. The method of claim 1, wherein the step of determining whether the sensitive file leakage exists in the network server according to the response result of the network server comprises:
for a response result corresponding to a file name in the first list, judging whether the value of the Content-Type in the response header exists in a Content-Type header list commonly used for compressed files, and if so, determining that the network server leaks the sensitive file with the file name.
10. The method of claim 1, wherein the step of determining whether the sensitive file leakage exists in the network server according to the response result of the network server comprises:
for a response result corresponding to a file name in the second list, judging whether the status code in the response header is a status code indicating that the network server successfully processed the request, if so, further matching the response content with the regular expression associated with the file name, and if the matching is successful, determining that the network server leaks the sensitive file with the file name.
11. The method of claim 1, wherein a web application recognition tool is used to perform application identification on the network server, the web application recognition tool being WhatWeb, BlindElephant, or Wappalyzer.
12. A computing device, comprising:
one or more processors; and
a memory;
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs comprising instructions for performing any of the methods of claims 1-11.
13. A computer readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by a computing device, cause the computing device to perform any of the methods of claims 1-11.
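By way of illustration only, the file name generation recited in claims 5 and 7 can be sketched as below. The suffix lists, the choice of derivative words (the full domain name plus each of its dot-separated labels), and all function names are assumptions; the claims do not fix any of them.

```python
# Hypothetical suffix lists; the claims only speak of "common" suffixes.
COMPRESSED_SUFFIXES = [".rar", ".zip", ".tar.gz", ".7z"]
BACKUP_SUFFIXES = [".bak", ".old", ".swp", "~"]


def domain_derivatives(domain):
    """Derive words from a domain name: the full name plus each label."""
    words = [domain]
    for label in domain.split("."):
        if label and label not in words:
            words.append(label)
    return words


def compressed_candidates(domain):
    """First-list entries: derivative words spliced with compressed suffixes."""
    return [word + suffix
            for word in domain_derivatives(domain)
            for suffix in COMPRESSED_SUFFIXES]


def config_backup_candidates(config_name, pattern):
    """Second-list entries: backup file names associated with one regex."""
    return {config_name + suffix: pattern for suffix in BACKUP_SUFFIXES}
```

Under these assumptions, compressed_candidates("localhost") contains localhost.rar, and config_backup_candidates("wp-config.php", pattern) associates wp-config.php.bak with the given regular expression.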
CN201710693457.9A 2017-08-14 2017-08-14 Method and computing device for detecting sensitive file leakage in network server Active CN107566349B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710693457.9A CN107566349B (en) 2017-08-14 2017-08-14 Method and computing device for detecting sensitive file leakage in network server

Publications (2)

Publication Number Publication Date
CN107566349A CN107566349A (en) 2018-01-09
CN107566349B true CN107566349B (en) 2019-12-24

Family

ID=60974069

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710693457.9A Active CN107566349B (en) 2017-08-14 2017-08-14 Method and computing device for detecting sensitive file leakage in network server

Country Status (1)

Country Link
CN (1) CN107566349B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110147675B (en) * 2019-05-22 2021-05-28 杭州安恒信息技术股份有限公司 Safety detection method and equipment for intelligent terminal

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103095530A (en) * 2013-01-21 2013-05-08 中国科学院信息工程研究所 Method and system for sensitive information monitoring and leakage prevention based on front-end gateway
CN104318162A (en) * 2014-09-27 2015-01-28 深信服网络科技(深圳)有限公司 Source code leakage detection method and device
CN106657151A (en) * 2017-02-06 2017-05-10 杭州迪普科技股份有限公司 Website information leakage protection method, apparatus and device
CN106815527A (en) * 2016-12-01 2017-06-09 全球能源互联网研究院 The detection method and device of a kind of IOS application datas safety

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100985857B1 (en) * 2007-12-24 2010-10-08 한국전자통신연구원 Device and method for detecting and preventing sensitive information leakage in portable terminal

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
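Sensitive file leakage detection method based on dynamic taint tracking; Li Weiming et al.; Journal of Huazhong University of Science and Technology (Natural Science Edition); 2016-11-16; full text *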

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: Room 311501, Unit 1, Building 5, Courtyard 1, Futong East Street, Chaoyang District, Beijing 100102

Applicant after: Beijing Zhichuangyu Information Technology Co., Ltd.

Address before: 100097 Jinwei Building 803, 55 Lanindichang South Road, Haidian District, Beijing

Applicant before: Beijing Knows Chuangyu Information Technology Co.,Ltd.

GR01 Patent grant