CN107908959B - Website information detection method and device, electronic equipment and storage medium - Google Patents

Publication number
CN107908959B
Authority
CN
China
Prior art keywords
network address
sensitive information
access result
preset
target network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711107083.4A
Other languages
Chinese (zh)
Other versions
CN107908959A (en)
Inventor
陈诚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Knownsec Information Technology Co Ltd
Original Assignee
Beijing Knownsec Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Knownsec Information Technology Co Ltd filed Critical Beijing Knownsec Information Technology Co Ltd
Priority to CN201711107083.4A priority Critical patent/CN107908959B/en
Publication of CN107908959A publication Critical patent/CN107908959A/en
Application granted granted Critical
Publication of CN107908959B publication Critical patent/CN107908959B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/556 Detecting local intrusion or implementing counter-measures involving covert channels, i.e. data leakage between processes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2221/00 Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/03 Indexing scheme relating to G06F21/50, monitoring users, programs or devices to maintain the integrity of platforms
    • G06F2221/033 Test or assess software

Abstract

The invention provides a website information detection method and device, electronic equipment and a storage medium, and relates to the technical field of computers. The website information detection method comprises the following steps: accessing a target network address to be detected by using a preset web application testing tool, wherein a driving engine of the web application testing tool is a preset browser engine; judging whether a first access result returned by a server of a target website corresponding to the target network address is acquired; if so, judging whether preset sensitive information exists in the first access result; and if so, judging that the sensitive information of the target website is leaked. The website information detection method can comprehensively detect whether sensitive information is leaked from the website.

Description

Website information detection method and device, electronic equipment and storage medium
Technical Field
The invention relates to the technical field of computers, in particular to a website information detection method and device, electronic equipment and a storage medium.
Background
At present, the volume of webpage data on many websites keeps growing and the number of links keeps increasing, so maintaining website data has become very important. Detecting sensitive information leakage of a website is an important part of website data maintenance.
The existing method for detecting sensitive information leakage of a website traverses links with a crawler to obtain the returned results, and then judges whether sensitive information exists in the returned results by means of a customized sensitive information feature library. However, most websites adopt an architecture in which the front end and the back end are separated, that is, the HTML is rendered by a front-end JS template and the back-end data is obtained asynchronously. Because a crawler cannot execute JS, it cannot capture the HTML rendered by the front-end JS template or the asynchronous request links, and therefore cannot obtain all of the returned results of the website. That is, the returned results used for detection are incomplete, which makes the detection of sensitive information leakage of the website inaccurate.
Disclosure of Invention
In view of this, embodiments of the present invention provide a method and an apparatus for detecting website information, an electronic device, and a storage medium, so as to solve the problem in the prior art that a returned result for detection is incomplete, which results in inaccurate detection of sensitive information leakage of a website.
In order to achieve the purpose, the technical scheme adopted by the invention is as follows:
In a first aspect, an embodiment of the present invention provides a method for detecting website information, where the method includes: accessing a target network address to be detected by using a preset web application testing tool, wherein a driving engine of the web application testing tool is a preset browser engine; judging whether a first access result returned by a server of a target website corresponding to the target network address is acquired; if so, judging whether preset sensitive information exists in the first access result; and if so, judging that the sensitive information of the target website is leaked.
In a second aspect, an embodiment of the present invention provides a website information detection apparatus, where the apparatus includes a simulation access module, a first determination module, a second determination module, and a first execution module, where the simulation access module is configured to access a target network address to be detected by using a preset web application test tool, where a driving engine of the web application test tool is a preset browser engine; the first judging module is used for judging whether a first access result returned by the server of the target website corresponding to the target network address is acquired; the second judging module is used for judging whether preset sensitive information exists in a first access result when the first access result returned by the server of the target website corresponding to the target network address is obtained; the first execution module is used for judging that the sensitive information of the target website is leaked when the preset sensitive information exists in the first access result.
In a third aspect, an embodiment of the present invention provides an electronic device, which includes a memory and a processor, where the memory stores computer instructions, and when the computer instructions are read and executed by the processor, the processor is caused to execute the method provided in the first aspect.
In a fourth aspect, an embodiment of the present invention provides a storage medium, where computer instructions are stored, where the computer instructions, when read and executed, perform the method provided in the first aspect.
According to the website information detection method, the website information detection device, the electronic equipment and the storage medium, the target network address to be detected is accessed through the preset web application test tool, wherein the driving engine of the web application test tool is a preset browser engine, whether a first access result returned by the server of the target website corresponding to the target network address is obtained or not is judged, when the first access result returned by the server of the target website corresponding to the target network address is obtained, whether preset sensitive information exists in the first access result or not is judged, and finally when the preset sensitive information exists in the first access result, it is judged that the sensitive information leakage exists in the target website. Therefore, the access result obtained by accessing the target network address is relatively complete, the accuracy for detecting whether sensitive information leakage exists in the website is improved, and the problem that the sensitive information leakage of the website is not accurately detected due to incomplete access result in the prior art is solved.
In order to make the aforementioned and other objects, features and advantages of the present invention comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
FIG. 1 is a schematic diagram illustrating an electronic device interacting with a server according to an embodiment of the present invention;
FIG. 2 is a block diagram of an electronic device provided by an embodiment of the invention;
FIG. 3 is a flowchart illustrating a website information detection method according to an embodiment of the present invention;
FIG. 4 is a flowchart illustrating step S130 in the website information detection method according to an embodiment of the present invention;
FIG. 5 is a block diagram illustrating a website information detecting apparatus according to an embodiment of the present invention;
FIG. 6 is a block diagram illustrating a second determining module in the website information detecting apparatus according to the embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, in the description of the present invention, the terms "first", "second", and the like are used only for distinguishing the description, and are not to be construed as indicating or implying relative importance.
Fig. 1 is a schematic diagram illustrating interaction between a server and an electronic device according to an embodiment of the present invention. The server 200 is communicatively coupled to one or more electronic devices 100 over a network for data communication or interaction. The server 200 may be a web server or the like, and the server 200 may serve as a server of a target website in the embodiment of the present invention. The electronic device 100 may be a Personal Computer (PC), and the electronic device 100 may also be a server or the like.
Fig. 2 shows a block diagram of an electronic device applicable to an embodiment of the present invention. As shown in FIG. 2, electronic device 100 includes a memory 102, a memory controller 104, one or more processors 106 (only one shown), a peripherals interface 108, a radio frequency module 110, an audio module 112, a display unit 114, and the like. These components communicate with each other via one or more communication buses/signal lines 116.
The memory 102 may be used to store software programs and modules, such as program instructions/modules corresponding to the website information detection method and apparatus in the embodiments of the present invention, and the processor 106 executes various functional applications and data processing, such as the website information detection method provided in the embodiments of the present invention, by running the software programs and modules stored in the memory 102.
The memory 102 may include high speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. Access to the memory 102 by the processor 106, and possibly other components, may be under the control of the memory controller 104.
The peripheral interface 108 couples various input/output devices to the processor 106 as well as to the memory 102. In some embodiments, the peripheral interface 108, the processor 106, and the memory controller 104 may be implemented in a single chip. In other examples, they may each be implemented on a separate chip.
The radio frequency module 110 is used for receiving and transmitting electromagnetic waves, and implementing interconversion between the electromagnetic waves and electrical signals, so as to communicate with a communication network or other devices.
Audio module 112 provides an audio interface to a user that may include one or more microphones, one or more speakers, and audio circuitry.
The display unit 114 provides a display interface between the electronic device 100 and a user. In particular, display unit 114 displays video output to the user, the content of which may include text, graphics, video, and any combination thereof.
It will be appreciated that the configuration shown in FIG. 2 is merely illustrative and that electronic device 100 may include more or fewer components than shown in FIG. 2 or have a different configuration than shown in FIG. 2. The components shown in fig. 2 may be implemented in hardware, software, or a combination thereof.
First embodiment
Fig. 3 is a flowchart illustrating a website information detection method according to an embodiment of the present invention. Referring to fig. 3, the method includes:
Step S110: accessing the target network address to be detected by using a preset web application testing tool, wherein a driving engine of the web application testing tool is a preset browser engine.
In the existing website information detection method, because the crawler does not have a JS rendering function, the access result obtained when the target network address is accessed is incomplete. Therefore, to ensure the completeness of the access result obtained by accessing the target network address, the tool or manner used to access the target network address needs to be improved.
The preset web application testing tool in the embodiment of the invention can be a WebDriver tool. Of course, the specific type of the preset web application testing tool is not limited in the embodiment of the present invention, and other tools, such as a Selenium tool, may be used.
In web application testing, the WebDriver tool is open source software that can control different browsers (such as Firefox, Chrome, Safari, IE) by defining a driving engine, and can open a URL and interact with the rendered page. The goal of WebDriver is to provide a well-designed set of object-oriented APIs that better support the testing of modern, advanced web applications.
In the embodiment of the invention, the preset browser engine can be selected as the driving engine of the WebDriver tool, so that access to a network address can be realized.
The browser engine is the most important part of the browser and has the function of JS rendering. It is mainly used for interpreting webpage syntax (such as HTML, an application of the Standard Generalized Markup Language, and JavaScript) and rendering (displaying) the webpage.
In an embodiment of the present invention, the preset browser engine may be a PhantomJS engine. Of course, the specific type of the preset browser engine is not limited in the embodiment of the present invention; it may also be a FirefoxDriver engine, an InternetExplorerDriver engine, a ChromeDriver engine, etc.
In the embodiment of the invention, PhantomJS is a headless (interface-free), scriptable WebKit browser engine. PhantomJS is a complete browser kernel, including a JS parsing engine, a rendering engine, request processing, and the like.
In the embodiment of the invention, a target network address list corresponding to the target website to be detected can be set, and each target network address in the list can be accessed by traversal.
In the embodiment of the invention, when the preset web application test tool is used for accessing the target network address to be detected, the access can be carried out by simulating browser behavior; that is, behaviors such as mouse operations and keyboard operations are simulated, and a request for accessing the target network address is generated and executed accordingly.
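Step S110 could be sketched roughly as follows. This is a minimal illustration, not the implementation of the embodiment: it assumes Selenium's Python bindings and uses headless Chrome in place of the PhantomJS engine named above (recent Selenium releases no longer include a PhantomJS driver). The function names `make_driver`, `fetch_rendered_html` and `crawl` are hypothetical.

```python
def make_driver():
    """Create a headless Chrome WebDriver (assumption: Selenium 4 and Chrome are installed)."""
    # Imported lazily so the helpers below can be exercised without Selenium.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")  # run the browser without a visible window
    return webdriver.Chrome(options=options)


def fetch_rendered_html(driver, url):
    """Open `url` in the driven browser and return the page HTML after scripts have run."""
    driver.get(url)            # the browser engine loads the page and executes its JS
    return driver.page_source  # serialized DOM, including JS-rendered markup


def crawl(driver, target_addresses):
    """Traverse the target network address list and collect one access result per address."""
    return {url: fetch_rendered_html(driver, url) for url in target_addresses}
```

A caller would typically create the driver once with `make_driver()`, pass it to `crawl`, and call `driver.quit()` afterwards. Pages that load data asynchronously after the initial load may additionally need an explicit wait before `page_source` is read.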
Step S120: judging whether a first access result returned by the server of the target website corresponding to the target network address is acquired.
After the target network address is accessed, a first access result returned by the server of the target website corresponding to the accessed target network address is obtained in most cases. However, the first access result may also fail to be obtained, for example when the access request is rejected by the server of the target website corresponding to the target network address.
Therefore, it can be judged whether the first access result returned by the server of the target website corresponding to the target network address is acquired, so as to determine whether the subsequent step of detecting sensitive information in the access result is performed.
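The judgment in step S120 can be illustrated with a small wrapper. This is only a sketch: the name `try_access` is hypothetical, `fetch` stands for whatever function performs the access in step S110, and treating an empty body as "no access result" is an assumption of the example rather than something stated in the description.

```python
from typing import Callable, Optional


def try_access(fetch: Callable[[str], str], url: str) -> Optional[str]:
    """Return the first access result for `url`, or None when no result is acquired."""
    try:
        result = fetch(url)
    except Exception:
        # For example, the request was rejected by the server or timed out.
        return None
    # Assumption of this sketch: an empty body also counts as "not acquired".
    return result if result else None
```

Only when the return value is not `None` would the sensitive information detection of step S130 be carried out.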
In the embodiment of the present invention, the first access result returned by the server of the target website corresponding to the target network address may contain links, and these links may be links related to the root domain name of the target network address accessed in step S110, that is, links belonging to the same website.
Accordingly, it can be determined whether the root domain name of each link in the first access result is the same as the root domain name of the target network address.
After this determination, a link whose root domain name is the same as that of the target network address may be taken as a first link and placed in the list of target network addresses to be accessed in step S110, so that the first link is accessed subsequently. The access result of each link having the same root domain name as the target network address can thus be obtained, which makes the detection of the website information more complete.
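The selection of first links can be sketched with the Python standard library. The helper names are hypothetical, and the root domain name is approximated here as the last two labels of the host name; that is an assumption that holds for hosts such as www.example.com but not for suffixes such as .com.cn, where a Public Suffix List lookup would be needed.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


def root_domain(url):
    """Naive root domain name: the last two labels of the host (assumption, see above)."""
    host = urlparse(url).hostname or ""
    return ".".join(host.split(".")[-2:])


class _LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def first_links(page_url, html):
    """Return absolute links in `html` whose root domain name matches that of `page_url`."""
    collector = _LinkCollector()
    collector.feed(html)
    target_root = root_domain(page_url)
    links = []
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)  # resolve relative links against the page
        if root_domain(absolute) == target_root and absolute not in links:
            links.append(absolute)
    return links
```

The returned links would then be appended to the target network address list so that they are accessed in a later pass of step S110.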
Therefore, in the embodiment of the present invention, for the access operation performed in step S110 on a network address having the same root domain name as the target network address, it may also be judged whether the first access result returned by the server of the target website is acquired. That is, step S120 may include:
judging whether a first access result returned by the server of the target website corresponding to the first link is obtained, wherein the first link is a link having the same root domain name as the target network address.
It can be understood that this judges whether the server of the target website returns a first access result for the link having the same root domain name as the target network address.
When the target network address to be detected is accessed, other requests may be generated at the same time. For example, when a certain web page is accessed, the user may be required to log in to the page at the same time. The network addresses in these other generated requests may be related to the root domain name of the target network address accessed in step S110, that is, they may be addresses belonging to the same website.
Thus, it may be determined whether the root domain name of the network address in each other generated request is the same as the root domain name of the target network address.
After this determination, a network address whose root domain name is the same as that of the target network address may be taken as a network address to be detected and placed in the list of target network addresses to be accessed in step S110, so that it is accessed subsequently. The access result of each such network address can thus be obtained, which makes the detection of the website information more complete.
Therefore, in the embodiment of the present invention, for the access operation performed in step S110 on a network address having the same root domain name as the target network address, it may also be judged whether the first access result returned by the server of the target website is acquired. That is, step S120 may include:
judging whether a first access result returned by the server of the target website corresponding to the network address corresponding to other network request information is acquired, wherein the network address corresponding to the other network request information is a network address having the same root domain name as the target network address.
It can be understood that this judges whether a first access result is returned by the server of the target website for the network address corresponding to the other network request information, where the root domain name of that network address is the same as the root domain name of the target network address in step S110.
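How the other generated requests are captured is not specified in this description. One possibility, assumed here purely for illustration, is Chrome driven through Selenium with performance logging enabled (the `goog:loggingPrefs` capability), in which case `driver.get_log("performance")` returns entries whose `message` field is a JSON string describing DevTools network events. The sketch below only parses such entries; the function names are hypothetical and the root domain name is again approximated by the last two host labels.

```python
import json
from urllib.parse import urlparse


def same_root(url_a, url_b):
    """Compare naive root domain names (last two host labels; an assumption of this sketch)."""
    def root(url):
        host = urlparse(url).hostname or ""
        return ".".join(host.split(".")[-2:])
    return root(url_a) != "" and root(url_a) == root(url_b)


def request_urls(performance_entries):
    """Extract the URL of every request recorded in the performance-log entries."""
    urls = []
    for entry in performance_entries:
        message = json.loads(entry["message"])["message"]
        if message.get("method") == "Network.requestWillBeSent":
            urls.append(message["params"]["request"]["url"])
    return urls


def addresses_to_detect(target_url, performance_entries, pending):
    """Append same-root request addresses to the pending list of target network addresses."""
    for url in request_urls(performance_entries):
        if same_root(url, target_url) and url not in pending:
            pending.append(url)
    return pending
```

Requests to third-party hosts (for example a CDN on another root domain name) are left out of the list, matching the same-website restriction described above.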
Step S130: if so, judging whether preset sensitive information exists in the first access result.
When it is determined in step S120 that the first access result is obtained, a step of determining whether preset sensitive information exists in the first access result may be performed.
In an embodiment of the present invention, the first access result may include a plurality of sub-results. Referring to fig. 4, step S130 may include:
Step S131: if so, reading the matching rule corresponding to the preset sensitive information from the first database.
The first database, which stores the characteristics of sensitive information, may pre-store matching rules for determining whether information is the preset sensitive information.
When it is determined that the first access result is obtained, the matching rule corresponding to the preset sensitive information can be read from the first database.
Step S132: judging whether a sub-result satisfying the matching rule exists among the plurality of sub-results.
It is understood that the first access result may include a plurality of HTML strings; that is, a sub-result may be one HTML string or, of course, a plurality of HTML strings.
The specific contents of the first access result and of the sub-results are not limited in the embodiment of the present invention.
In the embodiment of the present invention, each matching rule may be a rule corresponding to one kind of preset sensitive information. For example, the matching rule for a mobile phone number may be: 11 digits, the first of which is 1, and the like. For another example, the matching rule for an e-mail address may be: a combination of English letters/numbers/symbols more than 1 character long, followed by "@", followed by a combination of English letters/numbers/symbols more than 1 character long, followed by ".", followed by a combination of English letters/numbers/symbols more than 1 character long.
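The two example rules can be written as regular expressions. The patterns below are one possible reading, not the rules actually stored by the embodiment: the digit lookarounds on the phone rule and the exact character classes of the e-mail rule are assumptions, and the e-mail rule is read loosely as one or more characters per part.

```python
import re

# Mobile phone number: 11 digits, the first of which is 1. The lookarounds
# (an addition of this sketch) avoid matching inside a longer run of digits.
PHONE_RULE = re.compile(r"(?<!\d)1\d{10}(?!\d)")

# E-mail address: characters, "@", characters, ".", characters.
EMAIL_RULE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def matches(rule, sub_result):
    """Return every piece of content in `sub_result` that satisfies `rule`."""
    return rule.findall(sub_result)
```

For instance, `matches(PHONE_RULE, "<td>13800138000</td>")` yields the phone number, while a 16-digit order number containing the same digits is ignored.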
Thus, it can be determined whether there is a sub-result in the first access result that satisfies the matching rule.
Step S133: if yes, judging that the preset sensitive information exists in the first access result; if not, judging that the preset sensitive information does not exist in the first access result.
It can be understood that, when it is determined in step S132 that the sub-result meeting the matching rule exists in the first access result, it may be determined that the preset sensitive information exists in the first access result; when it is determined in step S132 that there is no sub-result satisfying the matching rule in the first access result, it may be determined that there is no preset sensitive information in the first access result.
In the embodiment of the present invention, when determining whether the first access result includes the preset sensitive information, since the preset sensitive information may be of multiple types, the first access result may be sequentially matched and determined with the matching rule corresponding to each type of the preset sensitive information, so that all the preset sensitive information included in the first access result may be accurately determined.
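Steps S131 to S133 can be sketched with an in-memory SQLite database standing in for the first database. The table name `matching_rules`, its columns, the two stored patterns and the function names are all hypothetical; the description does not say how the first database is organized.

```python
import re
import sqlite3


def demo_database():
    """Build a stand-in for the first database with two illustrative rules."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE matching_rules (info_type TEXT, pattern TEXT)")
    conn.executemany(
        "INSERT INTO matching_rules VALUES (?, ?)",
        [
            ("phone", r"(?<!\d)1\d{10}(?!\d)"),
            ("email", r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
        ],
    )
    return conn


def load_rules(conn):
    """Step S131: read the matching rule of every type of preset sensitive information."""
    rows = conn.execute("SELECT info_type, pattern FROM matching_rules").fetchall()
    return {info_type: re.compile(pattern) for info_type, pattern in rows}


def find_sensitive(sub_results, rules):
    """Steps S132 and S133: match every sub-result against each rule in turn."""
    hits = []
    for info_type, rule in rules.items():
        for sub_result in sub_results:
            for match in rule.finditer(sub_result):
                hits.append((info_type, match.group(0)))
    return hits  # an empty list means no preset sensitive information exists
```

A non-empty return value corresponds to the "yes" branch of step S133, and the collected pairs identify which types of preset sensitive information were found.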
Step S140: if so, judging that the sensitive information of the target website is leaked.
When the determination result in step S130 is that the preset sensitive information exists in the first access result, the preset sensitive information can be acquired simply by accessing the target website corresponding to the target network address, so it may be determined that the sensitive information of the target website is leaked.
In the embodiment of the invention, after it is determined that the sensitive information of the target website corresponding to the target network address is leaked, the specific sensitive information can be recorded, and the correspondence between the sensitive information and the target network address can be stored.
Therefore, in the embodiment of the present invention, the website information detection method may further include: acquiring the content corresponding to the preset sensitive information in the first access result; and storing the correspondence between that content and the target network address in a database.
It can be understood that, when it is determined in step S130 whether the preset sensitive information exists in the first access result, the content satisfying the matching rule corresponding to the preset sensitive information may be extracted, thereby obtaining the content corresponding to the preset sensitive information in the first access result.
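Storing the correspondence between the extracted content and the target network address might look like the following. SQLite, the table name `findings` and the function names are assumptions made for this sketch; the description only states that the correspondence is stored in a database.

```python
import sqlite3


def store_findings(conn, target_address, contents):
    """Store one row per piece of matched content, keyed by the target network address."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS findings (target_address TEXT, content TEXT)"
    )
    conn.executemany(
        "INSERT INTO findings (target_address, content) VALUES (?, ?)",
        [(target_address, content) for content in contents],
    )
    conn.commit()


def findings_for(conn, target_address):
    """Look up the stored content for a target network address, in insertion order."""
    rows = conn.execute(
        "SELECT content FROM findings WHERE target_address = ? ORDER BY rowid",
        (target_address,),
    )
    return [content for (content,) in rows]
```

Keeping the address alongside each piece of content lets a later report point to the exact page on which the leaked information appeared.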
In the embodiment of the invention, after it is determined that the sensitive information of the target website corresponding to the target network address is leaked, a prompt message can be output to notify the server of the website that sensitive information has been leaked.
In the website information detection method provided by the first embodiment of the present invention, a preset web application test tool whose driving engine is a browser engine is used to access the target network address to be detected, so that a complete access result can be effectively obtained. In addition, links related to the website that are obtained when the target network address is accessed, and addresses related to the website in other generated requests, are also accessed to obtain more access results. Sensitive information leakage detection can thus be performed on the complete information of the website, which improves the accuracy of detecting whether sensitive information leakage exists in the website and solves the prior-art problem that detection is inaccurate because the access results used for detection are incomplete.
Second embodiment
Referring to fig. 5, the website information detecting apparatus 300 according to a second embodiment of the present invention includes a simulation accessing module 310, a first determining module 320, a second determining module 330, and a first executing module 340. The simulation access module 310 is configured to access a target network address to be detected by using a preset web application test tool, where a driving engine of the web application test tool is a preset browser engine; the first determining module 320 is configured to determine whether to obtain a first access result returned by the server of the target website corresponding to the target network address; the second determining module 330 is configured to determine whether preset sensitive information exists in a first access result when the first access result returned by the server of the target website corresponding to the target network address is obtained; the first executing module 340 is configured to determine that the sensitive information of the target website is leaked when preset sensitive information exists in the first access result.
In an embodiment of the invention, the first access result comprises a plurality of sub-results. Referring to fig. 6, the second determining module 330 includes a rule reading unit 331, a rule determining unit 332, and a result determining unit 333. The rule reading unit 331 is configured to, when a first access result returned by a server of a target website corresponding to the target network address is obtained, read a matching rule corresponding to the preset sensitive information in a first database; the rule determining unit 332 is configured to determine whether there is a sub-result that satisfies the matching rule in the plurality of sub-results; the result determining unit 333 is configured to determine that the preset sensitive information exists in the first access result when a sub-result meeting the matching rule exists in the plurality of sub-results; the result determining unit is further configured to determine that the preset sensitive information does not exist in the first access result when a sub-result satisfying the matching rule does not exist in the plurality of sub-results.
In the embodiment of the present invention, the website information detecting apparatus 300 further includes a content obtaining module and a storage executing module. The content obtaining module is configured to obtain the content in the first access result that corresponds to the preset sensitive information; the storage executing module is configured to store, in a database, the correspondence between that content and the target network address.
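One possible way to store the correspondence is sketched below with Python's standard `sqlite3` module. The table name `leaks`, its columns, and the helper function names are assumptions made for illustration; the embodiment does not specify a database product or schema.

```python
import sqlite3
from typing import Iterable, List


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Create the (assumed) table that maps a target address to leaked content."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS leaks ("
        " target_address TEXT NOT NULL,"
        " content TEXT NOT NULL,"
        " UNIQUE (target_address, content))"
    )
    return conn


def store_correspondence(conn: sqlite3.Connection,
                         target_address: str,
                         contents: Iterable[str]) -> None:
    """Role of the storage executing module: persist address/content pairs,
    silently skipping pairs that are already stored."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR IGNORE INTO leaks (target_address, content) VALUES (?, ?)",
            [(target_address, c) for c in contents],
        )


def contents_for(conn: sqlite3.Connection, target_address: str) -> List[str]:
    """Look up everything recorded for one target network address."""
    rows = conn.execute(
        "SELECT content FROM leaks WHERE target_address = ? ORDER BY content",
        (target_address,),
    )
    return [row[0] for row in rows]
```

The uniqueness constraint is a design choice of this sketch: repeated scans of the same address then do not accumulate duplicate rows.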
In this embodiment of the present invention, the first determining module 320 may be specifically configured to determine whether a first access result returned by the server of the target website corresponding to a first link has been obtained, where the first link is a link having the same root domain name as the target network address.
In this embodiment of the present invention, the first determining module 320 may further be specifically configured to determine whether a first access result returned by the server of the target website corresponding to a network address corresponding to other network request information has been obtained, where the network address corresponding to the other network request information is a network address having the same root domain name as the target network address.
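The same-root-domain condition used for the first link and for the other network requests can be sketched as a simple filter. The sketch makes a loud simplifying assumption: the root domain name is taken to be the last two labels of the host name, which is wrong for suffixes such as `co.uk`; a real implementation would need a public suffix list. The function names are hypothetical.

```python
from typing import Iterable, List
from urllib.parse import urlsplit


def naive_root_domain(address: str) -> str:
    """ASSUMPTION: root domain = last two host labels (e.g. example.com).
    This is a simplification and mishandles multi-part public suffixes."""
    host = urlsplit(address).hostname or ""
    labels = host.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def same_root_domain(address: str, target_address: str) -> bool:
    """True when both addresses share a non-empty root domain name."""
    root = naive_root_domain(target_address)
    return bool(root) and naive_root_domain(address) == root


def in_scope(request_addresses: Iterable[str], target_address: str) -> List[str]:
    """Keep only the requests whose results belong to the target website."""
    return [a for a in request_addresses if same_root_domain(a, target_address)]
```

Filtering this way keeps responses from subdomains of the target website while discarding third-party resources such as externally hosted scripts.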
Third embodiment
Referring to fig. 2, the electronic device 100 according to a third embodiment of the present invention includes a memory 102 and a processor 106, where the memory 102 stores computer instructions, and when the computer instructions are read and executed by the processor 106, the processor 106 is caused to execute the website information detection method according to the first embodiment of the present invention.
Fourth embodiment
A fourth embodiment of the present invention provides a storage medium, in which a computer instruction is stored, where the computer instruction, when being read and executed, executes the website information detection method provided in the first embodiment of the present invention.
In summary, according to the website information detection method, the website information detection device, the electronic device, and the storage medium provided in the embodiments of the present invention, a preset web application test tool whose driving engine is a preset browser engine is used to access a target network address to be detected. It is then determined whether a first access result returned by the server of the target website corresponding to the target network address has been obtained. When the first access result has been obtained, it is determined whether preset sensitive information exists in the first access result. Finally, when the preset sensitive information exists in the first access result, it is determined that the target website has a sensitive information leak. Because the target network address is accessed through a browser engine, the access result obtained is relatively complete, which improves the accuracy of detecting whether a website leaks sensitive information and solves the prior-art problem of inaccurate detection caused by incomplete access results.
It should be noted that the embodiments in this specification are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and for the parts that are the same or similar, the embodiments may be referred to one another. Because the device embodiment is basically similar to the method embodiment, its description is relatively brief; for the relevant points, reference may be made to the corresponding description of the method embodiment.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method can be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, the functional modules in the embodiments of the present invention may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and various other media capable of storing program code.
It is noted that, herein, relational terms such as first and second may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention. It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A website information detection method is characterized by comprising the following steps:
accessing a target network address to be detected by using a preset web application testing tool, wherein a driving engine of the web application testing tool is a preset browser engine, and wherein accessing the target network address to be detected by using the preset web application testing tool comprises: simulating browser behaviors by using the preset web application testing tool, and correspondingly generating and executing a request for accessing the target network address, wherein the simulated browser behaviors comprise: the browser's behavior of receiving a mouse operation and/or the browser's behavior of receiving a keyboard operation;
judging whether a first access result returned by a server of a target website corresponding to the target network address is acquired;
if so, judging whether preset sensitive information exists in the first access result;
and if so, judging that the sensitive information of the target website is leaked.
2. The method of claim 1, wherein the first access result comprises a plurality of sub-results, and if yes, determining whether preset sensitive information exists in the first access result comprises:
if yes, reading a matching rule corresponding to the preset sensitive information in the first database;
judging whether a sub-result meeting the matching rule exists in the plurality of sub-results;
if yes, judging that the preset sensitive information exists in the first access result;
if not, judging that the preset sensitive information does not exist in the first access result.
3. The method according to claim 1 or 2, wherein if yes, after determining that the target website has the sensitive information leaked, the method further comprises:
acquiring corresponding content corresponding to the preset sensitive information in the first access result;
and storing the corresponding relation between the corresponding content and the target network address in a database.
4. The method according to claim 3, wherein the determining whether to obtain the first access result returned by the server of the target website corresponding to the target network address comprises:
and judging whether a first access result returned by the server of the target website corresponding to a first link is obtained, wherein the first link is a link having the same root domain name as the target network address.
5. The method according to claim 3, wherein the determining whether to obtain the first access result returned by the server of the target website corresponding to the target network address comprises:
and judging whether a first access result returned by a server of a target website corresponding to a network address corresponding to other network request information is acquired, wherein the network address corresponding to the other network request information is the network address with the same root domain name as the target network address.
6. A website information detection device, characterized by comprising a simulation access module, a first judgment module, a second judgment module and a first execution module, wherein,
the simulation access module is used for accessing a target network address to be detected by using a preset web application testing tool, wherein a driving engine of the web application testing tool is a preset browser engine, and wherein accessing the target network address to be detected by using the preset web application testing tool comprises: simulating browser behaviors by using the preset web application testing tool, and correspondingly generating and executing a request for accessing the target network address, wherein the simulated browser behaviors comprise: the browser's behavior of receiving a mouse operation and/or the browser's behavior of receiving a keyboard operation;
the first judging module is used for judging whether a first access result returned by the server of the target website corresponding to the target network address is acquired;
the second judging module is used for judging whether preset sensitive information exists in a first access result when the first access result returned by the server of the target website corresponding to the target network address is obtained;
the first execution module is used for judging that the sensitive information of the target website is leaked when the preset sensitive information exists in the first access result.
7. The apparatus of claim 6, wherein the first access result comprises a plurality of sub-results, wherein the second determination module comprises a rule reading unit, a rule determination unit, and a result determination unit, wherein,
the rule reading unit is used for reading a matching rule corresponding to the preset sensitive information in a first database when a first access result returned by a server of a target website corresponding to the target network address is obtained;
the rule judging unit is used for judging whether a sub-result meeting the matching rule exists in the plurality of sub-results;
the result determining unit is used for judging that the preset sensitive information exists in the first access result when a sub-result meeting the matching rule exists in the plurality of sub-results;
the result determining unit is further configured to determine that the preset sensitive information does not exist in the first access result when a sub-result satisfying the matching rule does not exist in the plurality of sub-results.
8. The apparatus of claim 6, further comprising a content acquisition module and a storage execution module, wherein,
the content acquisition module is used for acquiring corresponding content corresponding to the preset sensitive information in the first access result;
the storage execution module is used for storing the corresponding relation between the corresponding content and the target network address in a database.
9. An electronic device, comprising a memory and a processor, the memory storing computer instructions that, when read and executed by the processor, cause the processor to perform the method of any of claims 1-5.
10. A storage medium having stored thereon computer instructions, wherein the computer instructions, when read and executed, perform the method of any one of claims 1-5.
CN201711107083.4A 2017-11-10 2017-11-10 Website information detection method and device, electronic equipment and storage medium Active CN107908959B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711107083.4A CN107908959B (en) 2017-11-10 2017-11-10 Website information detection method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN107908959A CN107908959A (en) 2018-04-13
CN107908959B true CN107908959B (en) 2020-02-14

Family

ID=61844988

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711107083.4A Active CN107908959B (en) 2017-11-10 2017-11-10 Website information detection method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN107908959B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108898009A (en) * 2018-06-27 2018-11-27 平安科技(深圳)有限公司 A kind of anti-crawler method, terminal and computer-readable medium
CN109327436A (en) * 2018-09-27 2019-02-12 中国平安人寿保险股份有限公司 Safety detecting method, device, computer equipment and storage medium
CN109800378A (en) * 2019-01-23 2019-05-24 北京字节跳动网络技术有限公司 Content processing method, device and electronic equipment based on custom browser
CN110719274B (en) * 2019-09-29 2022-10-04 武汉极意网络科技有限公司 Network security control method, device, equipment and storage medium
CN111723400A (en) * 2020-06-16 2020-09-29 杭州安恒信息技术股份有限公司 JS sensitive information leakage detection method, device, equipment and medium
CN112000984A (en) * 2020-08-24 2020-11-27 杭州安恒信息技术股份有限公司 Data leakage detection method, device, equipment and readable storage medium
CN112671849A (en) * 2020-12-08 2021-04-16 北京健康之家科技有限公司 Sensitive data processing method and device based on real-time flow analysis
CN112653674B (en) * 2020-12-10 2023-01-10 奇安信网神信息技术(北京)股份有限公司 Interface security detection method and device, electronic equipment and storage medium
CN113779585A (en) * 2021-01-04 2021-12-10 北京沃东天骏信息技术有限公司 Unauthorized vulnerability detection method and device
CN114006776B (en) * 2021-12-31 2022-03-18 北京微步在线科技有限公司 Sensitive information leakage detection method and device

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101242279A (en) * 2008-03-07 2008-08-13 北京邮电大学 Automatic penetration testing system and method for WEB system
CN101799855A (en) * 2010-03-12 2010-08-11 北京大学 Simulated webpage Trojan detecting method based on ActiveX component
CN101808093A (en) * 2010-03-15 2010-08-18 北京安天电子设备有限公司 System and method for automatically detecting WEB security
CN102880830A (en) * 2011-07-15 2013-01-16 华为软件技术有限公司 Acquisition method and device of original test data
CN103699480A (en) * 2013-11-29 2014-04-02 杭州安恒信息技术有限公司 WEB dynamic security flaw detection method based on JAVA
CN103942497A (en) * 2013-09-11 2014-07-23 杭州安恒信息技术有限公司 Forensics type website vulnerability scanning method and system
CN104200166A (en) * 2014-08-05 2014-12-10 杭州安恒信息技术有限公司 Script-based website vulnerability scanning method and system
CN106326734A (en) * 2015-06-30 2017-01-11 阿里巴巴集团控股有限公司 Method and device for detecting sensitive information
CN106789877A (en) * 2016-11-15 2017-05-31 杭州安恒信息技术有限公司 A kind of validating vulnerability system based on sandbox
CN106845248A (en) * 2017-01-18 2017-06-13 北京工业大学 A kind of XSS leak detection methods based on state transition graph

Also Published As

Publication number Publication date
CN107908959A (en) 2018-04-13

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: Room 311501, Unit 1, Building 5, Courtyard 1, Futong East Street, Chaoyang District, Beijing

Applicant after: Beijing Zhichuangyu Information Technology Co., Ltd.

Address before: Room 803, Jinwei Building, 55 Lanindichang South Road, Haidian District, Beijing

Applicant before: Beijing Knows Chuangyu Information Technology Co.,Ltd.

GR01 Patent grant