CN111723400A - JS sensitive information leakage detection method, device, equipment and medium - Google Patents

Info

Publication number
CN111723400A
Authority
CN
China
Prior art keywords
file
sensitive information
url
rule base
information leakage
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010548330.XA
Other languages
Chinese (zh)
Inventor
廖喜君
范渊
黄进
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
DBAPPSecurity Co Ltd
Hangzhou Dbappsecurity Technology Co Ltd
Original Assignee
Hangzhou Dbappsecurity Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Dbappsecurity Technology Co Ltd
Priority to CN202010548330.XA
Publication of CN111723400A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6209 Protecting access to data via a platform, e.g. using keys or access control rules to a single file or object, e.g. in a secure envelope, encrypted and accessed using a key, or with access control rules appended to the object itself
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/955 Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
    • G06F16/9566 URL specific, e.g. using aliases, detecting broken or misspelled links
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/04 Processing captured monitoring data, e.g. for logfile generation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/06 Generation of reports

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The application discloses a JS sensitive information leakage detection method, device, equipment and medium. The method comprises the following steps: acquiring a URL to be detected of a target website; accessing the URL to be detected based on a crawler technology to obtain a corresponding first JS file; performing JS file scanning on the target website by using a file dictionary to obtain a corresponding second JS file; matching the file contents of the first JS file and the second JS file by using a preset rule base to obtain corresponding first sensitive information; if the first sensitive information is a sensitive URL, sending a request to the sensitive URL to obtain corresponding response data; and matching the response data by using the preset rule base to obtain corresponding second sensitive information. In this way, both the efficiency and the comprehensiveness of JS sensitive information leakage detection can be improved.

Description

JS sensitive information leakage detection method, device, equipment and medium
Technical Field
The application relates to the technical field of network security, in particular to a JS sensitive information leakage detection method, device, equipment and medium.
Background
JavaScript is a fairly simple yet powerful client-side scripting language. It is an interpreted language, so it is executed as it is interpreted. As a result, unlike server-side scripting languages (such as ASP and PHP) and compiled languages (such as C and C++), its source code can easily be obtained by anyone. During website development, developers often write sensitive information such as accounts and passwords, cookies, and API keys into JS (JavaScript) files for debugging. If this information is not removed before the program goes online and the JS files are not obfuscated or encrypted, an attacker can easily collect it, which poses various threats to web services and user privacy.
At present, the conventional method for detecting JS sensitive information leakage is to browse the test site with a packet capture tool, collect the JS files, and then search the JS file contents for keywords related to sensitive information or use regular expressions to look for information in a matching format. However, this approach is inefficient and its coverage is incomplete, so leaks are missed.
Disclosure of Invention
In view of this, an object of the present application is to provide a JS sensitive information leakage detection method, device, equipment and medium that improve both the efficiency and the comprehensiveness of JS sensitive information leakage detection. The specific scheme is as follows:
In a first aspect, the application discloses a JS sensitive information leakage detection method, which includes:
acquiring a URL to be detected of a target website;
accessing the URL to be detected based on a crawler technology to obtain a corresponding first JS file;
performing JS file scanning on the target website by using a file dictionary to obtain a corresponding second JS file;
matching the file contents of the first JS file and the second JS file by using a preset rule base to obtain corresponding first sensitive information;
if the first sensitive information is a sensitive URL, sending a request to the sensitive URL to obtain corresponding response data;
and matching the response data by using the preset rule base to obtain corresponding second sensitive information.
Optionally, accessing the URL to be detected based on a crawler technology to obtain the corresponding first JS file includes:
crawling the website page corresponding to the URL to be detected with an asynchronous crawler to obtain the corresponding first JS file.
Optionally, the JS sensitive information leakage detection method further includes:
stopping the crawler when every JS file crawled from the website page corresponding to the URL to be detected is already among the previously crawled JS files.
Optionally, before matching the first JS file and the second JS file by using the preset rule base, the method further includes:
filtering the first JS file and the second JS file by JS file name.
Optionally, before performing JS file scanning on the target website by using the file dictionary to obtain the corresponding second JS file, the method further includes:
removing from the file dictionary the JS file name corresponding to the first JS file.
Optionally, matching the first JS file and the second JS file by using the preset rule base includes:
matching the first JS file and the second JS file by using the regular expressions in the preset rule base.
Optionally, matching the response data by using the preset rule base includes:
matching the response data by using the regular expressions in the preset rule base.
Optionally, the JS sensitive information leakage detection method further includes:
classifying the first sensitive information and the second sensitive information by using the preset rule base;
generating a corresponding detection report, wherein the detection report includes the JS files matched by the preset rule base, the first sensitive information, the second sensitive information, and the sensitive information types.
In a second aspect, the application discloses a JS sensitive information leakage detection device, including:
the website URL acquisition module is used for acquiring a URL to be detected of a target website;
the JS file crawling module is used for visiting the URL to be detected based on a crawler technology to obtain a corresponding first JS file;
the file dictionary scanning module is used for scanning the JS files of the target website by using the file dictionary to obtain corresponding second JS files;
the JS file matching module is used for matching the file contents of the first JS file and the second JS file by using a preset rule base to obtain corresponding first sensitive information;
the response data acquisition module is used for sending a request to the sensitive URL to obtain corresponding response data if the first sensitive information is the sensitive URL;
and the response data matching module is used for matching the response data by utilizing the preset rule base to obtain corresponding second sensitive information.
In a third aspect, the application discloses JS sensitive information leakage detection equipment, which comprises a processor and a memory, wherein:
the memory is used for storing a computer program;
the processor is used for executing the computer program to implement the aforementioned JS sensitive information leakage detection method.
In a fourth aspect, the present application discloses a computer-readable storage medium for storing a computer program, wherein the computer program, when executed by a processor, implements the aforementioned JS sensitive information leakage detection method.
It can thus be seen that the application first acquires the URL to be detected of the target website, then accesses the URL to be detected based on a crawler technology to obtain the corresponding first JS file, and performs JS file scanning on the target website by using the file dictionary to obtain the corresponding second JS file. The file contents of the first JS file and the second JS file are then matched by using the preset rule base to obtain the corresponding first sensitive information. If the first sensitive information is a sensitive URL, a request is sent to the sensitive URL to obtain corresponding response data, and finally the response data is matched by using the preset rule base to obtain the corresponding second sensitive information. In this way, a comprehensive set of JS files is obtained through crawling and file dictionary scanning; after the contents of the obtained JS files are matched, requests are sent to the matched sensitive URLs and the response data is matched as well, which improves both the efficiency and the comprehensiveness of JS sensitive information leakage detection.
Drawings
In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings in the following description are only embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a JS sensitive information leakage detection method disclosed in the present application;
fig. 2 is a flowchart of a specific JS sensitive information leakage detection method disclosed in the present application;
fig. 3 is a flowchart of a specific JS sensitive information leakage detection method disclosed in the present application;
fig. 4 is a schematic structural view of a JS sensitive information leakage detection device disclosed in the present application;
fig. 5 is a structural diagram of JS sensitive information leakage detection equipment disclosed in the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
At present, the conventional method for detecting JS sensitive information leakage is to browse the test site with a packet capture tool, collect the JS files, and then search the JS file contents for keywords related to sensitive information. However, this approach is inefficient and its coverage is incomplete, so leaks are missed. The present application therefore provides a JS sensitive information leakage detection scheme that improves both the efficiency and the comprehensiveness of JS sensitive information leakage detection.
Referring to fig. 1, an embodiment of the application discloses a JS sensitive information leakage detection method, including:
Step S11: acquiring a URL (uniform resource locator) to be detected of the target website.
In a specific implementation, the URLs to be detected may be imported in batches or entered one by one.
That is, in this embodiment, the URLs to be detected may be imported in batches through the terminal device or entered individually, and the corresponding detection task is issued after the URLs are imported.
Step S12: accessing the URL to be detected based on a crawler technology to obtain a corresponding first JS file.
In a specific implementation, this embodiment may crawl the website page corresponding to the URL to be detected with an asynchronous crawler to obtain the corresponding first JS file.
The crawler is stopped when every JS file crawled from the website page corresponding to the URL to be detected is already among the previously crawled JS files.
It should be noted that this embodiment first uses a crawler to obtain the linked JS files: the crawler visits each page and collects the JS files referenced by it. To increase crawling speed, this embodiment uses an asynchronous crawler. Because the same JS files are included in many places on a website, to improve efficiency the crawler stops immediately when every JS file obtained from a crawled page has already been obtained before.
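The asynchronous crawl and its stop condition can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the application: the names ScriptSrcParser, extract_js_urls and crawl_js, the batch_size parameter, and the injected fetch coroutine are all hypothetical, and pages are fetched concurrently in small batches while the stop rule is checked page by page.

```python
import asyncio
from html.parser import HTMLParser
from urllib.parse import urljoin


class ScriptSrcParser(HTMLParser):
    """Collects the src attribute of every <script> tag in a page."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def extract_js_urls(page_url, html):
    """Return the absolute URLs of the scripts referenced by one page."""
    parser = ScriptSrcParser()
    parser.feed(html)
    return [urljoin(page_url, src) for src in parser.sources]


async def crawl_js(page_urls, fetch, batch_size=4):
    """Collect JS URLs; stop once a page yields only already-seen files.

    `fetch` is a hypothetical coroutine that returns the HTML of a URL.
    """
    seen = set()
    for start in range(0, len(page_urls), batch_size):
        batch = page_urls[start:start + batch_size]
        # Fetch the pages of one batch concurrently.
        pages = await asyncio.gather(*(fetch(url) for url in batch))
        for page_url, html in zip(batch, pages):
            found = extract_js_urls(page_url, html)
            if found and all(url in seen for url in found):
                return seen  # every JS file on this page was crawled before
            seen.update(found)
    return seen
```

Passing the fetch coroutine in as a parameter keeps the sketch independent of any particular HTTP client; a real crawler would supply one backed by an asynchronous HTTP library.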
Step S13: performing JS file scanning on the target website by using the file dictionary to obtain a corresponding second JS file.
In a specific embodiment, since JS files are generally all stored in one web directory, this embodiment may scan the directories of the target website by using the file dictionary to find the JS directory, and then perform a file dictionary scan on the contents of that JS directory.
In addition, to improve detection speed, the JS file name corresponding to the first JS file can be removed from the file dictionary before the JS file scan.
It should be noted that some JS files do not appear in the responses seen by the crawler. To acquire as many JS files to be detected as possible, this embodiment may fuzz the JS directory of the website with a dictionary of common JS file names; if the HTTP status code 200 is returned, the corresponding JS file exists. To improve efficiency, the JS file names already acquired by the crawler can first be excluded from the dictionary, which reduces the number of requests sent. The website paths of the JS files acquired by the crawler are then joined with the JS file names in the dictionary before fuzzing, which reduces invalid requests that return status code 404.
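The dictionary scan described above can be sketched as follows. This is an illustration only: build_candidates, scan and the injected get_status callable are hypothetical names, and the probe is passed in so the sketch makes no network requests itself.

```python
import posixpath
from urllib.parse import urljoin, urlsplit


def build_candidates(crawled_js_urls, dictionary):
    """Join each JS directory seen by the crawler with unseen dictionary names."""
    # File names the crawler already found are excluded to save requests.
    known_names = {posixpath.basename(urlsplit(u).path) for u in crawled_js_urls}
    # urljoin(url, ".") resolves to the directory that contains the file.
    directories = {urljoin(u, ".") for u in crawled_js_urls}
    names = [n for n in dictionary if n not in known_names]
    return sorted(urljoin(d, n) for d in directories for n in names)


def scan(candidates, get_status):
    """Keep the candidates whose probe returns HTTP status 200.

    `get_status` is a hypothetical callable mapping a URL to a status code.
    """
    return [url for url in candidates if get_status(url) == 200]
```

Building candidates only under directories where the crawler already found JS files follows the path-joining step in the text and avoids probing directories that are unlikely to exist.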
Step S14: matching the file contents of the first JS file and the second JS file by using a preset rule base to obtain corresponding first sensitive information.
In a specific implementation, this embodiment may match the first JS file and the second JS file by using the regular expressions in the preset rule base.
In addition, this embodiment may first obtain the file contents of the first JS file and the second JS file. For each obtained JS file, the content can be retrieved by sending an HTTP GET request; the response body is the content of the JS file. To prevent garbled characters, this embodiment decodes the JS content according to the character encoding used by the website.
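Decoding the response body according to the site's encoding might look like the sketch below. It assumes the encoding is announced in the Content-Type header of the response, which the text does not specify; decode_body and its default parameter are hypothetical names.

```python
from email.message import Message


def decode_body(body, content_type, default="utf-8"):
    """Decode raw response bytes using the charset from a Content-Type value."""
    header = Message()
    header["Content-Type"] = content_type
    charset = header.get_content_charset() or default
    try:
        return body.decode(charset, errors="replace")
    except LookupError:
        # Unknown charset name: fall back to the default encoding.
        return body.decode(default, errors="replace")
```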
Further, this embodiment may classify the first sensitive information and the second sensitive information by using the preset rule base.
Specifically, in this embodiment, an HTTP request is sent to the URL of each JS file to be tested, the corresponding response content is acquired, and rule base matching and classification are then performed on the response content.
The preset rule base contains regular expressions for common types of sensitive information together with the corresponding types, such as URL, email address, token or password leakage, file path, intranet IP, cloud leakage, mobile phone number, domain name, identity card number, and username and password.
That is, sensitive information such as email addresses, telephone numbers, identity card numbers, intranet IPs, and usernames and passwords can be collected in advance and expressed as regular expressions to form the rule base. The obtained JS file contents are then matched quickly by using the regular expressions in the rule base.
For example: matching a mobile phone number in the content of the JS file, extracting information satisfying a regular expression (.
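A rule base of this kind can be sketched as a mapping from sensitive-information type to a compiled regular expression. The three patterns below are illustrative assumptions, not the rules of the application, and RULE_BASE and match_rules are hypothetical names. A mobile phone rule is deliberately left out because the expression in the example above is truncated in the source.

```python
import re

# Hypothetical rule base: one pattern per sensitive-information type.
RULE_BASE = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "intranet_ip": re.compile(
        r"\b(?:10(?:\.\d{1,3}){3}"
        r"|192\.168(?:\.\d{1,3}){2}"
        r"|172\.(?:1[6-9]|2\d|3[01])(?:\.\d{1,3}){2})\b"
    ),
    "url": re.compile(r"https?://[^\s\"'<>]+"),
}


def match_rules(text, rules=RULE_BASE):
    """Return (type, matched text) pairs for every rule that matches."""
    findings = []
    for info_type, pattern in rules.items():
        for match in pattern.finditer(text):
            findings.append((info_type, match.group(0)))
    return findings
```

Because each finding carries the key of the rule that produced it, classification by rule type comes for free from the structure of the rule base.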
Step S15: if the first sensitive information is a sensitive URL, sending a request to the sensitive URL to obtain corresponding response data.
Step S16: matching the response data by using the preset rule base to obtain corresponding second sensitive information.
That is, if a URL is matched, a request packet is sent to that URL and rule matching is then performed on the response packet. If the matching succeeds, the matched information is extracted and classified according to the rule type.
In a specific implementation, the response data is matched by using the regular expressions in the preset rule base.
For example, if the URL http://www.xxxx.com/api.php is matched in the content of a JS file, a request is sent to that API and matching is performed on the response content. If, for instance, the intranet IP address 192.168.1.1 is matched, the result is marked as intranet IP according to the rule type.
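The second matching pass can be sketched as below, using only a URL rule and an intranet IP rule for brevity. The patterns, the second_pass function and the injected fetch callable are hypothetical; a real run would apply the whole rule base to each response.

```python
import re

# Illustrative patterns; the actual rules of the application are not given.
INTRANET_IP = re.compile(
    r"\b(?:10(?:\.\d{1,3}){3}|192\.168(?:\.\d{1,3}){2}"
    r"|172\.(?:1[6-9]|2\d|3[01])(?:\.\d{1,3}){2})\b"
)
URL = re.compile(r"https?://[^\s\"'<>]+")


def second_pass(js_content, fetch):
    """Request every URL found in JS content and match the responses.

    `fetch` is a hypothetical callable returning a response body as text.
    """
    findings = []
    for url in sorted(set(URL.findall(js_content))):
        body = fetch(url)
        for ip in INTRANET_IP.findall(body):
            findings.append({"source": url, "type": "intranet_ip", "value": ip})
    return findings
```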
It should be noted that some JS files remain on the web server but are no longer used because of version updates, or are backup copies. Such files do not appear in the URL list of a packet capture tool, so any sensitive information they contain cannot be detected by that approach. In addition, sensitive information often appears in the responses to AJAX (Asynchronous JavaScript And XML) requests made from JS files; if those URLs are not requested, that part of the sensitive information leakage is missed.
It can be seen that this embodiment of the application first acquires the URL to be detected of the target website, then accesses the URL to be detected based on a crawler technology to obtain the corresponding first JS file, and performs JS file scanning on the target website by using the file dictionary to obtain the corresponding second JS file. The file contents of the first JS file and the second JS file are then matched by using the preset rule base to obtain the corresponding first sensitive information. If the first sensitive information is a sensitive URL, a request is sent to the sensitive URL to obtain corresponding response data, and finally the response data is matched by using the preset rule base to obtain the corresponding second sensitive information. In this way, a comprehensive set of JS files is obtained through crawling and file dictionary scanning; after the contents of the obtained JS files are matched, requests are sent to the matched sensitive URLs and the response data is matched as well, which improves both the efficiency and the comprehensiveness of JS sensitive information leakage detection.
Referring to fig. 2, an embodiment of the application discloses a specific JS sensitive information leakage detection method, which includes:
Step S21: acquiring the URL to be detected of the target website.
Step S22: accessing the URL to be detected based on a crawler technology to obtain a corresponding first JS file.
Step S23: performing JS file scanning on the target website by using the file dictionary to obtain a corresponding second JS file.
Step S24: filtering the first JS file and the second JS file by JS file name.
In a specific implementation, this embodiment may first deduplicate the first JS file and the second JS file by JS file name, and then filter by JS file name to remove the JS files that do not need to be detected.
That is, the JS files obtained by the crawler and by dictionary scanning are first deduplicated. Further, some of these JS files belong to third-party components. Such files usually contain no sensitive information written by the site's developers, and if they are not filtered out, the number of files to detect becomes too large and detection efficiency drops. These files can be filtered out according to the characteristics of their file names, for example jquery and the like. To improve detection efficiency, this embodiment filters them with a whitelist. After filtering, the remaining JS file URLs are the ones to be tested for sensitive information leakage.
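Deduplication and file name filtering can be sketched as follows. The prefix list is a hypothetical example of a whitelist of third-party library names (only jquery is named in the text), and filter_js_urls is a hypothetical function name.

```python
import posixpath
from urllib.parse import urlsplit

# Hypothetical whitelist of third-party library name prefixes to skip.
THIRD_PARTY_PREFIXES = ("jquery", "bootstrap", "vue", "react")


def filter_js_urls(js_urls, skip_prefixes=THIRD_PARTY_PREFIXES):
    """Deduplicate JS URLs by file name and drop known third-party files."""
    kept, seen_names = [], set()
    for url in js_urls:
        name = posixpath.basename(urlsplit(url).path).lower()
        if name in seen_names or name.startswith(skip_prefixes):
            continue
        seen_names.add(name)
        kept.append(url)
    return kept
```

Deduplicating by file name rather than by full URL follows the wording of step S24; it treats the same file served from two paths as one file, at the cost of dropping distinct files that happen to share a name.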
Step S25: matching the file contents of the filtered first JS file and second JS file by using a preset rule base to obtain corresponding first sensitive information.
Step S26: if the first sensitive information is a sensitive URL, sending a request to the sensitive URL to obtain corresponding response data.
Step S27: matching the response data by using the preset rule base to obtain corresponding second sensitive information.
Step S28: generating a corresponding detection report, wherein the detection report includes the JS files matched by the preset rule base, the first sensitive information, the second sensitive information, and the sensitive information types.
That is, this embodiment can generate a detection report of JS sensitive information leakage, which includes the URL of each requested JS file, the leaked sensitive information, and its type, and output the report as a Word document.
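The contents of such a report can be sketched as a grouping of findings by JS file URL. The embodiment outputs a Word report; the sketch below substitutes JSON so that it needs only the standard library, and build_report and the js_url, type and value keys are hypothetical.

```python
import json


def build_report(findings):
    """Group findings by JS file URL and serialize them as JSON.

    Each finding is a dict with the hypothetical keys js_url, type, value.
    """
    report = {}
    for item in findings:
        entries = report.setdefault(item["js_url"], [])
        entries.append({"type": item["type"], "value": item["value"]})
    return json.dumps(report, indent=2, sort_keys=True)
```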
For example, referring to fig. 3, an embodiment of the present application discloses a flowchart of a specific JS-sensitive information leakage detection method.
That is, the application can automatically detect and verify JS sensitive information leakage, improves the accuracy and efficiency of detecting JS sensitive information leakage vulnerabilities, and helps website administrators and operation and maintenance personnel find and correct sensitive information leakage in JS files in time, so as to prevent attackers from exploiting the sensitive information.
Referring to fig. 4, an embodiment of the present application discloses a JS sensitive information leakage detection device, including:
a website URL obtaining module 11, configured to obtain a to-be-detected URL of a target website;
the JS file crawling module 12 is used for accessing the URL to be detected based on a crawler technology to obtain a corresponding first JS file;
the file dictionary scanning module 13 is configured to perform JS file scanning on the target website by using a file dictionary to obtain a corresponding second JS file;
the JS file matching module 14 is configured to match file contents of the first JS file and the second JS file by using a preset rule base, so as to obtain corresponding first sensitive information;
a response data obtaining module 15, configured to send a request to a sensitive URL if the first sensitive information is the sensitive URL, so as to obtain corresponding response data;
and the response data matching module 16 is configured to match the response data by using the preset rule base to obtain corresponding second sensitive information.
It can be seen that this embodiment of the application first acquires the URL to be detected of the target website, then accesses the URL to be detected based on a crawler technology to obtain the corresponding first JS file, and performs JS file scanning on the target website by using the file dictionary to obtain the corresponding second JS file. The file contents of the first JS file and the second JS file are then matched by using the preset rule base to obtain the corresponding first sensitive information. If the first sensitive information is a sensitive URL, a request is sent to the sensitive URL to obtain corresponding response data, and finally the response data is matched by using the preset rule base to obtain the corresponding second sensitive information. In this way, a comprehensive set of JS files is obtained through crawling and file dictionary scanning; after the contents of the obtained JS files are matched, requests are sent to the matched sensitive URLs and the response data is matched as well, which improves both the efficiency and the comprehensiveness of JS sensitive information leakage detection.
The JS file crawling module 12 is specifically configured to crawl the website page corresponding to the URL to be detected with an asynchronous crawler to obtain the corresponding first JS file.
The JS sensitive information leakage detection device further comprises a crawler stop control module, configured to stop the crawler when every JS file crawled from the website page corresponding to the URL to be detected is already among the previously crawled JS files.
The JS sensitive information leakage detection device further comprises a JS file filtering module, configured to filter the first JS file and the second JS file by JS file name.
The JS sensitive information leakage detection device further comprises a dictionary file name removal module, configured to remove from the file dictionary the JS file name corresponding to the first JS file.
The JS file matching module 14 is specifically configured to match the first JS file and the second JS file by using the regular expressions in the preset rule base;
the response data matching module 16 is specifically configured to match the response data by using the regular expressions in the preset rule base.
The JS sensitive information leakage detection device further comprises a sensitive information classification module, configured to classify the first sensitive information and the second sensitive information by using the preset rule base;
the JS sensitive information leakage detection device further comprises a detection report generation module, configured to generate a corresponding detection report, wherein the detection report includes the JS files matched by the preset rule base, the first sensitive information, the second sensitive information, and the sensitive information types.
Referring to fig. 5, an embodiment of the present application discloses JS sensitive information leakage detection equipment, which includes a processor 21 and a memory 22, wherein the memory 22 is used for storing a computer program, and the processor 21 is configured to execute the computer program to implement the JS sensitive information leakage detection method disclosed in the foregoing embodiments.
For a specific process of the JS sensitive information leakage detection method, reference may be made to corresponding contents disclosed in the foregoing embodiments, and details are not described here.
Further, an embodiment of the present application further discloses a computer-readable storage medium for storing a computer program, where the computer program is executed by a processor to implement the JS sensitive information leakage detection method disclosed in the foregoing embodiment.
For a specific process of the JS sensitive information leakage detection method, reference may be made to corresponding contents disclosed in the foregoing embodiments, and details are not described here.
The embodiments in this specification are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and for parts that are the same or similar, the embodiments may be referred to one another. Because the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is brief; for relevant details, refer to the description of the method.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, a register, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The JS sensitive information leakage detection method, device, equipment and medium provided by the present application have been described in detail above. Specific examples are used herein to explain the principle and implementation of the application, and the description of the embodiments is only intended to help in understanding the method and the core idea of the application. Meanwhile, a person skilled in the art may, following the idea of the application, vary the specific implementation and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.
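As an illustration of the core idea only, the two-stage matching (rule base against JS file contents, then rule base against the responses of any sensitive URLs found) might be sketched as follows. The `RULE_BASE` contents, both regular expressions, and the `match_rules` and `detect` names are hypothetical; the `fetch` callable is injected so that the sketch makes no assumption about the HTTP client used.

```python
import re
from typing import Callable, Dict, List, Tuple

# Hypothetical rule base: sensitive information type -> regular expression.
RULE_BASE = {
    "sensitive_url": re.compile(r"https?://[^\s\"'<>]+/api/[^\s\"'<>]*"),
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
}


def match_rules(text: str) -> List[Tuple[str, str]]:
    """Return (type, matched text) pairs for every rule that hits."""
    return [
        (kind, m.group(0))
        for kind, rx in RULE_BASE.items()
        for m in rx.finditer(text)
    ]


def detect(js_files: Dict[str, str], fetch: Callable[[str], str]):
    """Match JS contents, then re-match the responses of sensitive URLs."""
    first, second = [], []
    for name, content in js_files.items():
        for kind, value in match_rules(content):
            first.append((name, kind, value))        # first sensitive information
            if kind == "sensitive_url":
                response = fetch(value)               # request the sensitive URL
                second.extend((value, k, v) for k, v in match_rules(response))
    return first, second
```

Because each hit carries the key of the rule that produced it, the same rule base also supplies the sensitive information type used for classification.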

Claims (10)

1. A JS sensitive information leakage detection method, characterized by comprising:
acquiring a URL to be detected of a target website;
accessing the URL to be detected based on crawler technology to obtain a corresponding first JS file;
performing JS file scanning on the target website by using a file dictionary to obtain a corresponding second JS file;
matching the file contents of the first JS file and the second JS file by using a preset rule base to obtain corresponding first sensitive information;
if the first sensitive information is a sensitive URL, sending a request to the sensitive URL to obtain corresponding response data; and
matching the response data by using the preset rule base to obtain corresponding second sensitive information.
2. The JS sensitive information leakage detection method according to claim 1, wherein the accessing the URL to be detected based on crawler technology to obtain a corresponding first JS file comprises:
crawling the website page corresponding to the URL to be detected by using an asynchronous crawler to obtain the corresponding first JS file.
3. The JS sensitive information leakage detection method according to claim 1, further comprising:
stopping the crawler when every JS file crawled from the website page corresponding to the URL to be detected is identical to a JS file that has already been crawled.
4. The JS sensitive information leakage detection method according to claim 1, wherein before the matching of the file contents of the first JS file and the second JS file by using the preset rule base, the method further comprises:
filtering the first JS file and the second JS file by JS file name.
5. The JS sensitive information leakage detection method according to claim 1, wherein before the performing JS file scanning on the target website by using the file dictionary to obtain the corresponding second JS file, the method further comprises:
removing, from the file dictionary, the JS file name corresponding to the first JS file.
6. The JS sensitive information leakage detection method according to claim 1, wherein
the matching of the first JS file and the second JS file by using the preset rule base comprises:
matching the first JS file and the second JS file by using a regular expression in the preset rule base; and
the matching of the response data by using the preset rule base comprises:
matching the response data by using the regular expression in the preset rule base.
7. The JS sensitive information leakage detection method according to any one of claims 1 to 6, further comprising:
classifying the first sensitive information and the second sensitive information by using the preset rule base; and
generating a corresponding detection report, wherein the detection report comprises the JS file matched with the preset rule base, the first sensitive information, the second sensitive information, and the sensitive information type.
8. A JS sensitive information leakage detection device, characterized by comprising:
a website URL acquisition module, configured to acquire a URL to be detected of a target website;
a JS file crawling module, configured to access the URL to be detected based on crawler technology to obtain a corresponding first JS file;
a file dictionary scanning module, configured to perform JS file scanning on the target website by using a file dictionary to obtain a corresponding second JS file;
a JS file matching module, configured to match the file contents of the first JS file and the second JS file by using a preset rule base to obtain corresponding first sensitive information;
a response data acquisition module, configured to send a request to a sensitive URL to obtain corresponding response data if the first sensitive information is the sensitive URL; and
a response data matching module, configured to match the response data by using the preset rule base to obtain corresponding second sensitive information.
9. JS sensitive information leakage detection equipment, characterized by comprising a processor and a memory, wherein
the memory is configured to store a computer program; and
the processor is configured to execute the computer program to implement the JS sensitive information leakage detection method according to any one of claims 1 to 7.
10. A computer-readable storage medium, characterized by storing a computer program, wherein the computer program, when executed by a processor, implements the JS sensitive information leakage detection method according to any one of claims 1 to 7.
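The bookkeeping described in claims 3 to 5 can be illustrated with the minimal sketch below, which is not part of the claimed subject matter. All function names, the `IGNORED_NAME_PATTERNS` list, and the injected `exists` probe are assumptions introduced for illustration; in particular, which file names are filtered out is a guess, since the claims only state that filtering uses the JS file names.

```python
import fnmatch
from posixpath import basename
from typing import Callable, Iterable, List, Set
from urllib.parse import urljoin, urlparse

# Hypothetical name patterns for well-known libraries to skip (claim 4).
IGNORED_NAME_PATTERNS = ["jquery*.js", "bootstrap*.js"]


def js_name(url: str) -> str:
    """File name of a JS URL, ignoring any query string."""
    return basename(urlparse(url).path)


def should_stop(page_js_urls: Iterable[str], seen: Set[str]) -> bool:
    """Claim 3: stop once a page yields only JS files that were already crawled."""
    return all(url in seen for url in page_js_urls)


def keep_js(url: str) -> bool:
    """Claim 4: filter JS files by name before rule matching."""
    name = js_name(url).lower()
    return not any(fnmatch.fnmatchcase(name, p) for p in IGNORED_NAME_PATTERNS)


def prune_dictionary(dictionary: List[str], first_js_urls: Iterable[str]) -> List[str]:
    """Claim 5: drop dictionary entries whose names were already crawled."""
    crawled = {js_name(u) for u in first_js_urls}
    return [name for name in dictionary if name not in crawled]


def dictionary_scan(base_url: str, dictionary: List[str],
                    exists: Callable[[str], bool]) -> List[str]:
    """Probe base_url + name for each dictionary entry; keep the URLs that exist."""
    return [u for u in (urljoin(base_url, n) for n in dictionary) if exists(u)]
```

Pruning the dictionary before scanning avoids requesting files the crawler has already retrieved, so the dictionary scan only spends requests on JS files that are not linked from any crawled page.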
CN202010548330.XA 2020-06-16 2020-06-16 JS sensitive information leakage detection method, device, equipment and medium Pending CN111723400A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010548330.XA CN111723400A (en) 2020-06-16 2020-06-16 JS sensitive information leakage detection method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010548330.XA CN111723400A (en) 2020-06-16 2020-06-16 JS sensitive information leakage detection method, device, equipment and medium

Publications (1)

Publication Number Publication Date
CN111723400A true CN111723400A (en) 2020-09-29

Family

ID=72566924

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010548330.XA Pending CN111723400A (en) 2020-06-16 2020-06-16 JS sensitive information leakage detection method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN111723400A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112347331A (en) * 2020-11-11 2021-02-09 福建有度网络安全技术有限公司 JS sensitive information leakage detection method, device, equipment and medium

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101510195A (en) * 2008-02-15 2009-08-19 刘峰 Website safety protection and test diagnosis system structure method based on crawler technology
US8752183B1 (en) * 2012-07-10 2014-06-10 Hoyt Technologies, Inc. Systems and methods for client-side vulnerability scanning and detection
CN104683328A (en) * 2015-01-29 2015-06-03 兴华永恒(北京)科技有限责任公司 Method and system for scanning cross-site vulnerability
CN105678170A (en) * 2016-01-05 2016-06-15 广东工业大学 Method for dynamically detecting cross site scripting (XSS) bugs
CN107579976A (en) * 2017-09-06 2018-01-12 杭州安恒信息技术有限公司 The method and device of self-defined detection website sensitive information
CN107908959A (en) * 2017-11-10 2018-04-13 北京知道创宇信息技术有限公司 Site information detection method, device, electronic equipment and storage medium
CN108667766A (en) * 2017-03-28 2018-10-16 腾讯科技(深圳)有限公司 File detection method and file detection device
CN109672658A (en) * 2018-09-25 2019-04-23 平安科技(深圳)有限公司 Detection method, device, equipment and the storage medium of JSON abduction loophole
US20200125729A1 (en) * 2016-07-10 2020-04-23 Cyberint Technologies Ltd. Online assets continuous monitoring and protection

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
冶运涛 等: "虚拟流域环境理论技术研究与应用", 大连海事大学出版社, pages: 243 - 248 *
翟涵: "基于网络爬虫的Web安全扫描工具的设计与实现", 《 中国优秀硕士学位论文全文数据库信息科技辑》, no. 11 *

Similar Documents

Publication Publication Date Title
US9614862B2 (en) System and method for webpage analysis
US9954886B2 (en) Method and apparatus for detecting website security
CN103888490B (en) A kind of man-machine knowledge method for distinguishing of full automatic WEB client side
TWI515588B (en) Machine behavior determination method, web browser and web server
CN109768992B (en) Webpage malicious scanning processing method and device, terminal device and readable storage medium
CN107341395B (en) Method for intercepting reptiles
JP2020515944A (en) System and method for direct in-browser markup of elements in Internet content
US9792370B2 (en) Identifying equivalent links on a page
CN111008405A (en) Website fingerprint identification method based on file Hash
CN113518077A (en) Malicious web crawler detection method, device, equipment and storage medium
CN114528457A (en) Web fingerprint detection method and related equipment
CN113810381A (en) Crawler detection method, web application cloud firewall, device and storage medium
CN114003794A (en) Asset collection method, device, electronic equipment and medium
CN111224923A (en) Detection method, device and system for counterfeit websites
CN114157568B (en) Browser secure access method, device, equipment and storage medium
CN110619075A (en) Webpage identification method and equipment
US11023590B2 (en) Security testing tool using crowd-sourced data
CN111723400A (en) JS sensitive information leakage detection method, device, equipment and medium
CN107786529B (en) Website detection method, device and system
CN111125704B (en) Webpage Trojan horse recognition method and system
CN110457900B (en) Website monitoring method, device and equipment and readable storage medium
CN105243134B (en) A kind of method and apparatus handling browser of being held as a hostage
US11556819B2 (en) Collection apparatus, collection method, and collection program
CN109246069B (en) Webpage login method and device and readable storage medium
CN116451271A (en) Automatic privacy policy extraction method for application software

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200929