CN109167757B - Vulnerability detection method of web application, terminal and computer readable medium - Google Patents

Info

Publication number
CN109167757B
Authority
CN
China
Prior art keywords
web application
information
similarity
web
html
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810854861.4A
Other languages
Chinese (zh)
Other versions
CN109167757A (en
Inventor
周东旭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN201810854861.4A
Priority to PCT/CN2018/108673 (WO2020019511A1)
Publication of CN109167757A
Application granted
Publication of CN109167757B
Legal status: Active
Anticipated expiration

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/14Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1433Vulnerability analysis
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/02Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

A vulnerability detection method, a terminal and a computer readable medium of a web application are provided, the method comprises the following steps: analyzing the obtained web application to obtain web component information of the web application; judging whether the web component information changes or not according to the web component information and historical web component information of the web application in a historical record; if the web component information is judged to be changed, determining that the web application is changed; if the web component information of the web application is judged not to be changed, acquiring content information of a page corresponding to the web application, and determining whether the web application is changed or not according to the content information; when the web application is determined to be changed, calling a preset vulnerability scanning engine to carry out vulnerability detection on the changed web application, so that vulnerability detection on the web application is realized, and vulnerability detection efficiency is improved.

Description

Vulnerability detection method of web application, terminal and computer readable medium
Technical Field
The present invention relates to the field of communications technologies, and in particular, to a vulnerability detection method for web applications, a terminal, and a computer-readable medium.
Background
At present, monitoring and vulnerability scanning of web applications mainly relies on scanning all data information corresponding to the web application. This approach cannot promptly discover new vulnerability data information, and scanning all data information makes the scan inefficient. How to improve vulnerability scanning efficiency has therefore become a focus of security detection research.
Disclosure of Invention
The embodiment of the invention provides a vulnerability detection method, a terminal and a computer readable medium for web application, which can realize vulnerability detection on the web application and improve vulnerability detection efficiency.
In a first aspect, an embodiment of the present invention provides a method for detecting a vulnerability of a web application, where the method includes:
analyzing the obtained Web application to obtain Web component information of the Web application, wherein the Web component information is entity information for encapsulating data and methods of the Web application;
judging whether the web component information of the web application changes or not according to the web component information and historical web component information of the web application in a historical record, wherein the historical record stores the historical web component information of the web application;
if the web component information of the web application is judged to be changed, determining that the web application is changed;
if the web component information of the web application is judged not to be changed, acquiring content information of a page corresponding to the web application, and determining whether the web application is changed or not according to the content information;
and calling a preset vulnerability scanning engine to carry out vulnerability detection on the changed web application when the web application is determined to be changed.
Further, the determining whether the web application is changed according to the content information includes:
acquiring information of the HTML tag corresponding to the web application;
judging whether an HTML page corresponding to the HTML tag changes or not according to the information of the HTML tag;
and if the HTML page is judged to be changed, determining that the web application is changed.
Further, the information of the HTML tag includes a plurality of HTML tags; the judging whether the HTML page corresponding to the HTML tag changes or not according to the information of the HTML tag comprises the following steps:
generating a structure tree of the HTML tags according to the plurality of HTML tags;
determining a difference value between the structural tree of the HTML tag and the structural tree of the HTML tag corresponding to the web application recorded in the historical record at the last time;
and if the difference value is larger than a preset threshold value, determining that the HTML page changes.
Further, the determining a difference value between the structure tree of the HTML tag and the structure tree of the HTML tag corresponding to the web application recorded last time in the history record includes:
acquiring a structure difference value and a tag difference value of the structure tree of the HTML tag and the structure tree of the HTML tag corresponding to the web application recorded in the historical record at the last time;
wherein the structure difference value comprises: difference values of the levels of the structure tree or difference of the number of child nodes of the structure tree; the tag difference values include: any one or more of a difference value of the number of tags, a difference value of the ID of the tag and a difference value of the class of the tag;
and determining the difference value between the structure tree of the HTML tag and the structure tree of the HTML tag corresponding to the web application recorded most recently in the history record according to the structure difference value and a preset structure difference value weight, and the tag difference value and a preset tag difference value weight.
Further, the determining whether the web application is changed according to the content information includes:
according to a preset similarity algorithm, determining the similarity between the content information and the content information obtained by scanning the web application in the historical record for the last time;
judging whether the similarity is greater than a preset threshold value or not;
if the similarity is judged to be larger than the preset threshold value, determining that the web application is not changed;
and if the similarity is judged to be less than or equal to the preset threshold value, determining that the web application is changed.
Further, the content information includes text information and path information; the determining the similarity between the content information and the content information obtained by scanning the web application for the last time in the history record according to a preset similarity algorithm includes:
determining the text similarity between the text information and the text information obtained by scanning the web application for the last time in the history record according to a preset similarity algorithm;
determining the path similarity between the path information and the path information obtained by scanning the web application for the last time in the history record according to a preset similarity algorithm;
and determining the similarity of the content information according to the text similarity and the path similarity.
Further, the determining the similarity of the content information according to the text similarity and the path similarity includes:
and determining the similarity of the content information according to the text similarity, a preset text similarity weight and the path similarity and a preset path similarity weight.
In a second aspect, an embodiment of the present invention provides a terminal, where the terminal includes a unit configured to perform the method of the first aspect.
In a third aspect, an embodiment of the present invention provides another terminal, which includes a processor, an input device, an output device, and a memory, where the processor, the input device, the output device, and the memory are connected to each other, where the memory is used to store a computer program that supports the terminal to execute the foregoing method, and the computer program includes program instructions, and the processor is configured to call the program instructions to execute the foregoing method according to the first aspect.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium, in which a computer program is stored, the computer program comprising program instructions, which, when executed by a processor, cause the processor to perform the method of the first aspect.
According to the embodiments of the present invention, the acquired web application is analyzed to obtain its web component information. If the web component information is judged to have changed, the web application is determined to have changed; if it is judged not to have changed, content information of a page corresponding to the web application is acquired and used to determine whether the web application has changed. When the web application is determined to have changed, a preset vulnerability scanning engine is called to perform vulnerability detection on the changed web application, so that vulnerability detection of the web application is achieved and vulnerability detection efficiency is improved.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of a vulnerability detection method for a web application according to an embodiment of the present invention;
Fig. 2 is a schematic flowchart of another vulnerability detection method for a web application according to an embodiment of the present invention;
Fig. 3 is a schematic flowchart of another vulnerability detection method for a web application according to an embodiment of the present invention;
Fig. 4 is a schematic block diagram of a terminal according to an embodiment of the present invention;
Fig. 5 is a schematic block diagram of another terminal according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It is to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the specification of the present invention and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
The vulnerability detection method of the web application provided by the embodiment of the invention can be executed by a terminal, which may be an intelligent terminal such as a mobile phone, a computer, a tablet, or a smart watch. The following describes the vulnerability detection method as applied to such a terminal.
In the embodiment of the present invention, the terminal may analyze the acquired web application to obtain the web component information of the web application. The web application may be downloaded by the user on the terminal or acquired by the terminal according to an IP address; the manner of acquiring the web application is not specifically limited in the embodiment of the present invention. In some embodiments, the web component information is entity information that encapsulates the data and methods of the web application. The terminal can judge whether the web component information has changed according to the web component information and the historical web component information of the web application in the history record. If the web component information has changed, the web application can be determined to have changed. If it has not changed, content information of a page corresponding to the web application can be obtained, and whether the web application has changed is determined according to the content information. When the web application is determined to have changed, the terminal can call a preset vulnerability scanning engine to perform vulnerability detection on the changed web application, which improves vulnerability detection efficiency. The following describes embodiments of the present invention in detail with reference to the accompanying drawings.
Referring to fig. 1, fig. 1 is a schematic flowchart of a vulnerability detection method for a web application according to an embodiment of the present invention, and as shown in fig. 1, the method may be executed by a terminal, and a specific explanation of the terminal is as described above and is not described herein again. Specifically, the method of the embodiment of the present invention includes the following steps.
S101: Analyze the acquired web application to obtain the web component information of the web application.
In this embodiment of the present invention, the terminal may analyze the acquired web application to obtain its web component information, where the web component information is entity information encapsulating the data and methods of the web application. In some embodiments, the web component information refers to information about a web component and/or about a web component version, and the web component may be, for example, a Python component, a JavaBean component, or a PHP component. In a specific implementation, the terminal may invoke a web scanning tool to scan the acquired web application at a preset detection period, so as to obtain the web component information used by the web application. For example, if the preset detection period is 1 day (24 hours), the web scanning tool is invoked every 24 hours to scan the acquired web application and obtain the web components and/or web component versions it uses.
S102: Judge whether the web component information has changed according to the web component information and the historical web component information of the web application in the history record; if so, perform step S103, and if not, perform step S104.
In the embodiment of the present invention, the terminal may determine whether the web component information changes according to the web component information and historical web component information of the web application in a history record, where the history record stores the historical web component information of the web application. In a specific implementation process, the terminal may compare, according to the obtained web component information used by the web application, the web component information used by the web application with historical web component information obtained by scanning the web application last time in a history record, and may determine that the web application has changed if the web component information of the web application changes in a comparison result, where the change in the web component information may include a change in a web component and/or a change in a web component version.
Specifically, for example, assume that the web application acquired by the terminal is a blog. If the scan result most recently recorded in the history shows that the blog uses the web component WordPress, and the current scan shows that it uses phpCms, the terminal may determine that the web component of the blog has changed, that is, that the web component information of the web application has changed. As another example, if the web application acquired by the terminal is a website, the development language in its most recently recorded web component information is PHP, and the development language in the currently acquired web component information is Python, the terminal may determine that the web component information has changed. As a further example, if the web application is a website, the nginx version in its most recently recorded web component information is 1.1, and the nginx version in the currently acquired web component information is 1.2, the terminal may determine that the web component information has changed.
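The comparison in step S102 can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the web component information is stored as a dictionary mapping a component role to a name or version; the function names `component_info_changed` and `describe_changes` and the dictionary keys are hypothetical and do not come from the patent.

```python
def component_info_changed(current: dict, historical: dict) -> bool:
    """Return True if any component was added, removed, replaced, or re-versioned."""
    return current != historical


def describe_changes(current: dict, historical: dict) -> dict:
    """Map each differing key to an (old value, new value) pair."""
    changes = {}
    for key in sorted(set(current) | set(historical)):
        old, new = historical.get(key), current.get(key)
        if old != new:
            changes[key] = (old, new)
    return changes


# Hypothetical records mirroring the nginx example in the text.
historical = {"cms": "wordpress", "language": "PHP", "nginx": "1.1"}
current = {"cms": "wordpress", "language": "PHP", "nginx": "1.2"}

print(component_info_changed(current, historical))  # True
print(describe_changes(current, historical))        # {'nginx': ('1.1', '1.2')}
```

A plain equality check is enough to decide whether to proceed to step S103; the per-key report is only a convenience for logging what changed.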
S103: Determine that the web application has changed, and perform step S105.
In this embodiment of the present invention, if the terminal determines that the web component information changes, it may determine that the web application changes, and execute step S105.
S104: Acquire content information of a page corresponding to the web application, and determine whether the web application has changed according to the content information.
In the embodiment of the invention, if the terminal judges that the web component information does not change, the terminal can acquire the content information of the page corresponding to the web application and determine whether the web application changes according to the content information.
In an embodiment, the content information includes information of an HTML tag, and the terminal may determine whether the HTML page corresponding to the HTML tag changes by acquiring the information of the HTML tag corresponding to the web application when determining whether the web application changes according to the content information, and determine that the web application changes if the HTML page changes.
In an embodiment, the terminal may determine, according to a preset similarity algorithm, a similarity between the content information and content information obtained by scanning the web application most recently in the history, determine whether the similarity is greater than a preset threshold, determine that the web application is not changed if the similarity is greater than the preset threshold, and determine that the web application is changed if the similarity is less than or equal to the preset threshold.
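The threshold rule in this embodiment can be sketched as follows. The patent does not name the preset similarity algorithm, so this sketch assumes Python's standard `difflib.SequenceMatcher` as a stand-in; the function names and the threshold value 0.9 are likewise illustrative assumptions.

```python
from difflib import SequenceMatcher


def content_similarity(current: str, previous: str) -> float:
    """Similarity in [0, 1]; SequenceMatcher is an assumed stand-in algorithm."""
    return SequenceMatcher(None, previous, current).ratio()


def web_app_changed(current: str, previous: str, threshold: float = 0.9) -> bool:
    """Changed when similarity is less than or equal to the preset threshold."""
    return content_similarity(current, previous) <= threshold


print(web_app_changed("welcome to my blog", "welcome to my blog"))  # False
print(web_app_changed("aaaa", "bbbb"))                              # True
```

Note that the boundary case follows the text: a similarity exactly equal to the threshold counts as a change.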
In an embodiment, the content information includes text information and path information, and the terminal may determine, according to a preset similarity algorithm, a text similarity between the text information and text information obtained by scanning the web application last time in the history, and determine, according to the preset similarity algorithm, a path similarity between the path information and path information obtained by scanning the web application last time in the history, and further determine, according to the text similarity and the path similarity, a similarity between the content information. In some embodiments, the terminal may determine the similarity of the content information according to the text similarity and a preset text similarity weight, and the path similarity and a preset path similarity weight.
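As a hedged illustration of combining the two similarities, the sketch below computes text similarity with `difflib.SequenceMatcher` and path similarity as the Jaccard overlap of two path sets, then takes a weighted sum. Both algorithm choices, the function names, and the weights 0.7 and 0.3 are assumptions made for this example; the patent only requires some preset algorithm and preset weights.

```python
from difflib import SequenceMatcher


def text_similarity(current: str, previous: str) -> float:
    """Assumed text similarity: difflib ratio in [0, 1]."""
    return SequenceMatcher(None, previous, current).ratio()


def path_similarity(current_paths, previous_paths) -> float:
    """Assumed path similarity: Jaccard overlap of the two path sets."""
    cur, prev = set(current_paths), set(previous_paths)
    if not cur and not prev:
        return 1.0  # two empty path sets are treated as identical
    return len(cur & prev) / len(cur | prev)


def combined_similarity(text_sim: float, path_sim: float,
                        text_weight: float = 0.7, path_weight: float = 0.3) -> float:
    """Weighted sum of text and path similarity; the weights are hypothetical."""
    return text_sim * text_weight + path_sim * path_weight


t = text_similarity("welcome to my blog", "welcome to my blog")
p = path_similarity(["/", "/login"], ["/", "/login", "/admin"])
print(round(combined_similarity(t, p), 3))  # 0.9
```

Here the text is unchanged (similarity 1.0) but one of three known paths has disappeared (similarity 2/3), so the combined value drops to about 0.9 and may or may not cross the preset threshold depending on how it is configured.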
In an embodiment, the terminal may calculate the similarity of content information corresponding to the web application by using a preset similarity algorithm, and if the calculated similarity is smaller than a preset threshold, may determine whether an HTML page corresponding to an HTML tag changes by obtaining information of the HTML tag corresponding to the web application and according to the information of the HTML tag, and if the HTML page changes, determine that the web application changes. In some embodiments, the methods for determining whether the web application changes may be combined with each other for determination, and the order of determination by the determination method is not specifically limited in the embodiments of the present invention.
S105: When the web application is determined to have changed, call a preset vulnerability scanning engine to perform vulnerability detection on the changed web application.
In the embodiment of the invention, when the terminal determines that the web application has changed, the preset vulnerability scanning engine can be called to perform vulnerability detection on the changed web application. The terminal therefore does not need to perform vulnerability detection on every acquired web application, which improves vulnerability detection efficiency.
In the embodiment of the invention, the terminal can analyze the acquired web application to obtain the web component information of the web application. If the web component information is judged to have changed, the web application is determined to have changed; if it is judged not to have changed, the content information of the page corresponding to the web application is acquired and used to determine whether the web application has changed. When the web application is determined to have changed, a preset vulnerability scanning engine is called to perform vulnerability detection on the changed web application, so that not every acquired web application has to be scanned and vulnerability detection efficiency is improved.
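The overall flow of steps S101 to S105 can be sketched as a single function. This is a minimal sketch under stated assumptions, not the patented implementation: the history is modeled as an in-memory dictionary, the content check and the scanning engine are passed in as callables, and every name (`detect_and_scan`, `naive_content_check`, and so on) is hypothetical. The handling of an application with no history record is also an assumption, since the text does not cover that case.

```python
def detect_and_scan(app_id, components, content, history, content_changed, scan):
    """Run one detection cycle; return True if the app was judged changed and scanned."""
    previous = history.get(app_id)
    if previous is None:
        # Assumption (not stated in the text): an application with no
        # history record is treated as changed so that it gets a first scan.
        changed = True
    elif components != previous["components"]:
        changed = True  # S102 -> S103: component information changed
    else:
        # S104: fall back to comparing page content information
        changed = content_changed(content, previous["content"])
    if changed:
        scan(app_id)  # S105: call the preset vulnerability scanning engine
    history[app_id] = {"components": components, "content": content}
    return changed


history = {}
scanned = []  # stands in for the scanning engine: records which apps were scanned


def naive_content_check(current, previous):
    """Placeholder content check: any difference counts as a change."""
    return current != previous


detect_and_scan("blog", {"cms": "wordpress"}, "hello", history, naive_content_check, scanned.append)
detect_and_scan("blog", {"cms": "wordpress"}, "hello", history, naive_content_check, scanned.append)
detect_and_scan("blog", {"cms": "phpCms"}, "hello", history, naive_content_check, scanned.append)
print(scanned)  # ['blog', 'blog']
```

The second cycle triggers no scan because neither the components nor the content changed, which is where the efficiency gain described above comes from.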
Referring to fig. 2, fig. 2 is a schematic flowchart of another web application vulnerability detection method according to an embodiment of the present invention, and as shown in fig. 2, the method may be executed by a terminal, and a specific explanation of the terminal is as described above, which is not described herein again. The difference between the embodiment of the present invention and the embodiment described in fig. 1 is that the embodiment of the present invention determines whether the web application changes according to the information of the HTML tag of the web application, so as to perform vulnerability detection on the web application that changes. Specifically, the method of the embodiment of the present invention includes the following steps.
S201: Analyze the acquired web application to obtain the web component information of the web application.
In the embodiment of the present invention, the terminal may analyze the acquired web application to obtain the web component information of the web application, where the explanation of the web component information is as described above and is not described herein again.
S202: If the web component information is judged to be unchanged, acquire the information of the HTML tag of the page corresponding to the web application.
In the embodiment of the invention, if the terminal judges that the web component information has not changed, the terminal can acquire the information of the HTML tag of the page corresponding to the web application.
S203: Judge whether the HTML page corresponding to the HTML tag has changed according to the information of the HTML tag.
In the embodiment of the invention, the terminal can judge whether the HTML page corresponding to the HTML tag changes or not according to the information of the HTML tag.
In an embodiment, the information of the HTML tag may include a plurality of HTML tags, the terminal may generate a structure tree of the HTML tag according to the plurality of HTML tags, determine a difference value between the structure tree of the HTML tag and a structure tree of an HTML tag corresponding to the web application recorded in the history last time, and if it is determined that the difference value is greater than a preset threshold, the terminal may determine that the HTML page is changed.
In one embodiment, the terminal may obtain a structure difference value and a tag difference value of the structure tree of the HTML tag and the structure tree of the HTML tag corresponding to the web application recorded last time in the history. Wherein the structure difference value comprises: difference values of the levels of the structure tree or difference of the number of child nodes of the structure tree; the tag difference values include: any one or more of a difference value of the number of tags, a difference value of the tag ID, and a difference value of the tag class.
In an embodiment, the terminal may generate a structure tree of the HTML tag according to the plurality of HTML tags, and determine a structure difference value and/or a tag difference value between the structure tree of the HTML tag and the structure tree of the HTML tag corresponding to the web application recorded last time in the history. In some embodiments, the terminal may determine, according to the structure difference value and a preset weight of the structure difference value, and the tag difference value and a preset weight of the tag difference value, a difference value between the structure tree of the HTML tag and the structure tree of the HTML tag corresponding to the web application recorded last time in the history. In some embodiments, the terminal may determine, according to the structure difference value and a preset weight of the structure difference value, or the tag difference value and a preset weight of the tag difference value, a difference value between the structure tree of the HTML tag and the structure tree of the HTML tag corresponding to the web application recorded last time in the history.
Specifically, for example, assume that the information of the HTML tag of the HTML page corresponding to the web application currently acquired by the terminal is <html><div><p><span>aaaa</span></p></div></html>. This information includes a plurality of HTML tags, from which the terminal generates the structure tree html -> div -> p -> span. If the structure tree most recently recorded in the history for the HTML page corresponding to the web application is html -> span, the terminal can determine that the difference value of the structure tree levels (one of the structure difference values) is 2, and that the difference value of the number of tags (one of the tag difference values) is 2. Assuming that the preset structure difference value weight is 0.5, the preset tag difference value weight is 0.6, and the preset threshold is 4, the difference value is 2 × 0.5 + 2 × 0.6 = 2.2. The terminal determines that the HTML page has changed only if the difference value is greater than the preset threshold; with these example values, 2.2 does not exceed 4, so a change would be reported only under a preset threshold lower than 2.2.
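The worked example can be reproduced with a short Python sketch built on the standard `html.parser` module. It is an illustration under assumptions: the structure difference is taken as the difference in tree levels and the tag difference as the difference in tag counts (two of the options the text lists), and the class and function names are hypothetical.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not deepen the tree.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class TagTreeStats(HTMLParser):
    """Collect the number of levels and the number of tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.levels = 0
        self.tag_count = 0

    def handle_starttag(self, tag, attrs):
        self.tag_count += 1
        level = self.depth + 1
        self.levels = max(self.levels, level)
        if tag not in VOID_TAGS:
            self.depth = level

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.depth > 0:
            self.depth -= 1


def tree_stats(html: str):
    parser = TagTreeStats()
    parser.feed(html)
    parser.close()
    return parser.levels, parser.tag_count


def weighted_difference(current_html, previous_html,
                        structure_weight=0.5, tag_weight=0.6):
    """Weighted sum of the level difference and the tag-count difference."""
    cur_levels, cur_tags = tree_stats(current_html)
    prev_levels, prev_tags = tree_stats(previous_html)
    structure_diff = abs(cur_levels - prev_levels)
    tag_diff = abs(cur_tags - prev_tags)
    return structure_diff * structure_weight + tag_diff * tag_weight


current = "<html><div><p><span>aaaa</span></p></div></html>"
previous = "<html><span>aaaa</span></html>"
difference = weighted_difference(current, previous)
print(round(difference, 2))  # 2.2
print(difference > 4)        # False
```

With the weights from the example, the sketch yields the same difference value of 2.2; compared against the example threshold of 4 it reports no change, so the outcome depends entirely on how the preset threshold is chosen.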
S204: If the HTML page is judged to have changed, determine that the web application has changed.
In the embodiment of the invention, if the terminal judges that the HTML page changes, the change of the web application can be determined.
S205: Call a preset vulnerability scanning engine to perform vulnerability detection on the changed web application.
In the embodiment of the invention, when the terminal determines that the web application has changed, a preset vulnerability scanning engine can be called to perform vulnerability detection on the changed web application. The preset vulnerability scanning engine can be an existing vulnerability scanning engine and is not specifically limited in the embodiment of the invention. With this implementation, the terminal does not need to perform vulnerability detection on every acquired web application, which improves vulnerability detection efficiency.
In the embodiment of the invention, the terminal can analyze the acquired web application to obtain the web component information of the web application. If the web component information is judged to be unchanged, the terminal acquires the information of the HTML tag of the page corresponding to the web application and judges, according to that information, whether the HTML page corresponding to the HTML tag has changed. If the HTML page is judged to have changed, the web application is determined to have changed, and a preset vulnerability scanning engine is called to perform vulnerability detection on the changed web application. In this way, vulnerability detection is performed only on web applications whose HTML pages have changed, which improves vulnerability detection efficiency.
Referring to fig. 3, fig. 3 is a schematic flowchart of another web application vulnerability detection method according to an embodiment of the present invention, and as shown in fig. 3, the method may be executed by a terminal, and a specific explanation of the terminal is as described above and is not described herein again. The difference between the embodiment of the present invention and the embodiment described in fig. 2 is that the embodiment of the present invention determines whether the web application changes according to the text information and the path information of the web application, so as to perform vulnerability detection on the web application that changes. Specifically, the method of the embodiment of the present invention includes the following steps.
S301: Acquire content information of a page corresponding to the web application.
In the embodiment of the invention, the terminal can call a web scanning tool to scan the acquired web application according to a preset detection period so as to acquire the web component information used by the web application, and if the web component information is judged to be unchanged, the content information of the page corresponding to the web application is acquired. The content information includes any one or more of information of an HTML tag, text information, and path information.
In some embodiments, the content information includes text information and/or path information. The text information may be the text content of the page corresponding to the web application, and the path information may be HTML relative path information and/or HTML absolute path information in that page. In some embodiments, an HTML relative path is the path of a file within the website expressed relative to other files (or folders); an HTML absolute path is the complete path, including the domain name, of a file within the website.
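As an illustration of the two kinds of path information, the following Python sketch collects `href` and `src` values from a page and separates paths that carry a domain name from those that do not. It is only one possible reading of the description: the class `PathCollector`, the function `split_paths`, the sample page and the `example.com` domain are all assumptions made for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class PathCollector(HTMLParser):
    """Collects href/src attribute values from a page."""

    def __init__(self):
        super().__init__()
        self.paths = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.paths.append(value)


def split_paths(html):
    """Return (relative paths, absolute paths) found in the markup."""
    collector = PathCollector()
    collector.feed(html)
    collector.close()
    relative, absolute = [], []
    for path in collector.paths:
        # A path that carries a domain name (netloc) is treated as absolute.
        (absolute if urlparse(path).netloc else relative).append(path)
    return relative, absolute


page = ('<a href="../img/logo.png">logo</a>'
        '<script src="https://example.com/js/app.js"></script>')
relative, absolute = split_paths(page)
print(relative)   # ['../img/logo.png']
print(absolute)   # ['https://example.com/js/app.js']

# A relative path can be resolved against the URL of the page it appears in.
print(urljoin("https://example.com/shop/index.html", relative[0]))
# https://example.com/img/logo.png
```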
S302: Determine, according to a preset similarity algorithm, the similarity between the content information and the content information obtained by the most recent scan of the web application in the history record.
In the embodiment of the invention, the terminal can determine the similarity between the content information and the content information obtained by scanning the web application in the historical record at the latest time according to a preset similarity algorithm. The preset similarity algorithm is not specifically limited in the embodiments of the present invention.
In an embodiment, the terminal may determine, according to a preset similarity algorithm, a text similarity between the text information and the text information obtained by scanning the web application most recently in the history, determine, according to the preset similarity algorithm, a path similarity between the path information and the path information obtained by scanning the web application most recently in the history, and determine the similarity of the content information according to the text similarity and the path similarity.
In an embodiment, the terminal may determine the similarity of the content information according to the text similarity and a preset text similarity weight, together with the path similarity and a preset path similarity weight. In a specific implementation, the terminal may add the product of the text similarity and the preset text similarity weight to the product of the path similarity and the preset path similarity weight, and determine the similarity of the content information from the resulting sum.
The specific implementation process may be illustrated as follows. Assume that the preset similarity algorithm is the simhash algorithm. According to the simhash algorithm, the terminal calculates the text similarity A between the text information of the page corresponding to the web application and the text information of that page obtained by the most recent scan in the history record, and calculates the path similarity B between the path information of the page corresponding to the web application and the path information of that page obtained by the most recent scan in the history record. Then, with a preset text similarity weight x and a preset path similarity weight y, the terminal calculates the similarity C between the content information corresponding to the web application and the content information of the web application obtained by the most recent scan in the history record according to the formula C = A × x + B × y, and determines whether the web application has changed according to the similarity C.
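The following Python sketch shows one way the weighted similarity C = A × x + B × y could be computed with a simhash fingerprint. The description does not say how a similarity score is derived from two simhash values, so the mapping used here (one minus the Hamming distance divided by the fingerprint length) is an assumption. So are the word-level tokenization, the use of SHA-256 as the per-token hash, and the function names.

```python
import hashlib
import re


def simhash(text, bits=64):
    """Return a simhash fingerprint of the text (word tokens, equal weights)."""
    counts = [0] * bits
    for token in re.findall(r"\w+", text.lower()):
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[:bits // 8], "big")
        for i in range(bits):
            counts[i] += 1 if (h >> i) & 1 else -1
    fingerprint = 0
    for i, c in enumerate(counts):
        if c > 0:
            fingerprint |= 1 << i
    return fingerprint


def simhash_similarity(a, b, bits=64):
    """Assumed mapping: 1 - (Hamming distance / number of bits)."""
    distance = bin(simhash(a, bits) ^ simhash(b, bits)).count("1")
    return 1.0 - distance / bits


def content_similarity(text, old_text, paths, old_paths, x=0.5, y=0.5):
    """C = A * x + B * y, with A the text and B the path similarity."""
    A = simhash_similarity(text, old_text)
    B = simhash_similarity(" ".join(paths), " ".join(old_paths))
    return A * x + B * y


same = content_similarity("welcome to the shop", "welcome to the shop",
                          ["/js/app.js"], ["/js/app.js"])
print(same)  # 1.0
```

Identical content yields a similarity of 1.0. How quickly the score falls as the page diverges depends on the tokenization and fingerprint length chosen.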
S303: Judge whether the similarity is greater than a preset threshold; if so, execute step S304; if not, execute step S305.
In this embodiment of the present invention, the terminal may determine whether the similarity is greater than a preset threshold, if the similarity is greater than the preset threshold, execute step S304, and if the similarity is less than or equal to the preset threshold, execute step S305.
For example, assume that the preset threshold is 0.7. Suppose the preset similarity algorithm gives a text similarity of 0.6 between the text information of the page corresponding to the web application and the text information of that page obtained by the most recent scan in the history record, and a path similarity of 0.8 between the path information of the page corresponding to the web application and the path information of that page obtained by the most recent scan in the history record. If the preset text similarity weight is 0.5 and the preset path similarity weight is also 0.5, the similarity between the content information corresponding to the web application and the content information of the web application obtained by the most recent scan in the history record is 0.5 × 0.6 + 0.5 × 0.8 = 0.7. Since 0.7 is equal to, and therefore not greater than, the preset threshold, step S305 applies and it can be determined that the web application has changed.
S304: determining that the web application has not changed.
In the embodiment of the present invention, if the terminal determines that the similarity is greater than the preset threshold, it may be determined that the web application is not changed.
S305: Determine that the web application has changed, and call a preset vulnerability scanning engine to perform vulnerability detection on the changed web application.
In the embodiment of the invention, when the terminal judges that the similarity is less than or equal to the preset threshold, the change of the web application can be determined, and a preset vulnerability scanning engine is called to carry out vulnerability detection on the changed web application.
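The decision rule of steps S303 to S305 (similarity greater than the threshold means unchanged; less than or equal to the threshold means changed) can be written as a short Python sketch. The function names are hypothetical, and the numbers are the ones used in the example above.

```python
def weighted_similarity(text_sim, path_sim, text_weight, path_weight):
    """Sum of the weighted text similarity and the weighted path similarity."""
    return text_sim * text_weight + path_sim * path_weight


def web_app_changed(similarity, threshold):
    # Greater than the threshold: not changed (S304).
    # Less than or equal to the threshold: changed (S305).
    return not similarity > threshold


c = weighted_similarity(0.6, 0.8, 0.5, 0.5)
print(round(c, 2))                 # 0.7
print(web_app_changed(0.7, 0.7))   # True: the boundary case counts as changed
print(web_app_changed(0.9, 0.7))   # False
```

Because the boundary case is decided by an exact comparison, an implementation working with floating-point similarities may want to round or use a small tolerance before comparing.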
The embodiment of the invention also provides a terminal, which includes units for executing the method described in any of the foregoing embodiments. Specifically, referring to fig. 4, fig. 4 is a schematic block diagram of a terminal according to an embodiment of the present invention. The terminal of this embodiment includes: an analysis unit 401, a determining unit 402, a first determining unit 403, a second determining unit 404, and a detecting unit 405.
The analysis unit 401 is configured to analyze the acquired web application to obtain web component information of the web application, where the web component information is entity information for encapsulating data and methods of the web application;
a determining unit 402, configured to determine whether web component information of the web application changes according to the web component information and historical web component information of the web application in a history;
a first determining unit 403, configured to determine that the web application changes if it is determined that web component information of the web application changes, where history web component information of the web application is stored in the history record;
a second determining unit 404, configured to, if it is determined that the web component information of the web application does not change, obtain content information of a page corresponding to the web application, and determine whether the web application changes according to the content information;
and the detecting unit 405 is configured to, when it is determined that the web application changes, invoke a preset vulnerability scanning engine to perform vulnerability detection on the changed web application.
Further, the content information includes information of a hypertext markup language (HTML) tag,
when determining whether the web application changes according to the content information, the second determining unit 404 is specifically configured to perform the following steps:
acquiring information of the HTML tag corresponding to the web application;
judging whether an HTML page corresponding to the HTML tag changes or not according to the information of the HTML tag;
and if the HTML page is judged to be changed, determining that the web application is changed.
Further, the information of the HTML tag includes a plurality of HTML tags;
the second determining unit 404 is specifically configured to, when determining whether the HTML page corresponding to the HTML tag changes according to the information of the HTML tag, perform the following steps:
generating a structure tree of the HTML labels according to the HTML labels;
determining a difference value between the structural tree of the HTML tag and the structural tree of the HTML tag corresponding to the web application recorded in the historical record at the last time;
and if the difference value is larger than a preset threshold value, determining that the HTML page changes.
Further, when the second determining unit 404 determines a difference value between the structure tree of the HTML tag and the structure tree of the HTML tag corresponding to the web application recorded last in the history, it is specifically configured to perform the following steps:
acquiring a structure difference value and a tag difference value of the structure tree of the HTML tag and the structure tree of the HTML tag corresponding to the web application recorded in the historical record at the last time;
wherein the structure difference value comprises: difference values of the levels of the structure tree or difference of the number of child nodes of the structure tree; the tag difference values include: any one or more of a difference value of the number of tags, a difference value of the ID of the tag and a difference value of the class of the tag;
and determining the difference value between the structure tree of the HTML label and the structure tree of the HTML label corresponding to the web application recorded in the history record at the latest time according to the structure difference value, the preset structure difference value weight and the label difference value and the preset label difference value weight.
Further, when the second determining unit 404 determines whether the web application changes according to the content information, it is specifically configured to perform the following steps:
according to a preset similarity algorithm, determining the similarity between the content information and the content information obtained by scanning the web application in the historical record for the last time;
judging whether the similarity is greater than a preset threshold value or not;
if the similarity is judged to be larger than the preset threshold value, determining that the web application is not changed;
and if the similarity is judged to be less than or equal to the preset threshold value, determining that the web application is changed.
Further, the content information includes text information and path information; when the second determining unit 404 determines, according to a preset similarity algorithm, a similarity between the content information and content information obtained by scanning the web application last time in the history, the second determining unit is specifically configured to perform the following steps:
determining the text similarity between the text information and the text information obtained by scanning the web application for the last time in the history record according to a preset similarity algorithm;
determining the path similarity between the path information and the path information obtained by scanning the web application for the last time in the history record according to a preset similarity algorithm;
and determining the similarity of the content information according to the text similarity and the path similarity.
Further, when determining the similarity of the content information according to the text similarity and the path similarity, the second determining unit 404 is specifically configured to execute the following steps:
and determining the similarity of the content information according to the text similarity, a preset text similarity weight and the path similarity and a preset path similarity weight.
In the embodiment of the invention, the terminal can analyze the acquired web application to obtain the web component information of the web application, if the web component information is judged to be changed, the web application is determined to be changed, if the web component information is judged not to be changed, the content information of the page corresponding to the web application is acquired, and when the web application is determined to be changed, a preset vulnerability scanning engine is called to carry out vulnerability detection on the changed web application, so that vulnerability detection on all the acquired web applications is avoided, and vulnerability detection efficiency is improved.
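The overall flow summarized above can be sketched in Python as follows. The callables `analyze_components`, `content_changed` and `scan` are hypothetical stand-ins for the web scanning tool, the content comparison and the preset vulnerability scanning engine. Treating an application with no history entry as changed is an assumption of this sketch, as are the component names and versions in the usage example.

```python
def detect(app, history, analyze_components, content_changed, scan):
    """Run the vulnerability scan only when the web application has changed."""
    components = analyze_components(app)
    previous = history.get(app)
    if previous is None or components != previous:
        changed = True                  # web component information changed
    else:
        changed = content_changed(app)  # fall back to the page content
    history[app] = components           # record the latest component info
    if changed:
        return scan(app)
    return None


# Usage with made-up component data: nothing changed, so no scan is run.
history = {"shop": {"nginx": "1.14", "jquery": "3.3.1"}}
result = detect(
    "shop",
    history,
    analyze_components=lambda app: {"nginx": "1.14", "jquery": "3.3.1"},
    content_changed=lambda app: False,
    scan=lambda app: f"scanned {app}",
)
print(result)  # None
```

Passing the three steps in as callables keeps the decision logic independent of any particular scanning tool or engine, which matches the description's statement that the engine is not specifically limited.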
Referring to fig. 5, fig. 5 is a schematic block diagram of another terminal provided in the embodiment of the present invention. The terminal in this embodiment, as shown in the figure, may include: one or more processors 501, one or more input devices 502, one or more output devices 503, and a memory 504. The processor 501, the input device 502, the output device 503, and the memory 504 are connected by a bus 505. The memory 504 is used to store a computer program comprising program instructions, and the processor 501 is used to execute the program instructions stored in the memory 504. The processor 501 is configured to perform the following steps:
analyzing the obtained web application to obtain web component information of the web application, wherein the web component information is entity information for encapsulating data and methods of the web application;
judging whether the web component information of the web application changes or not according to the web component information and historical web component information of the web application in a historical record, wherein the historical record stores the historical web component information of the web application;
if the web component information of the web application is judged to be changed, determining that the web application is changed;
if the web component information of the web application is judged not to be changed, acquiring content information of a page corresponding to the web application, and determining whether the web application is changed or not according to the content information;
and calling a preset vulnerability scanning engine to carry out vulnerability detection on the changed web application when the web application is determined to be changed.
Further, the content information includes information of a hypertext markup language HTML tag, and the processor 501 is configured to perform the following steps:
acquiring information of the HTML tag corresponding to the web application;
judging whether an HTML page corresponding to the HTML tag changes or not according to the information of the HTML tag;
and if the HTML page is judged to be changed, determining that the web application is changed.
Further, the information of the HTML tag includes a plurality of HTML tags; the processor 501 is configured to perform the following steps:
generating a structure tree of the HTML labels according to the HTML labels;
determining a difference value between the structural tree of the HTML tag and the structural tree of the HTML tag corresponding to the web application recorded in the historical record at the last time;
and if the difference value is larger than a preset threshold value, determining that the HTML page changes.
Further, the processor 501 is configured to perform the following steps:
acquiring a structure difference value and a tag difference value of the structure tree of the HTML tag and the structure tree of the HTML tag corresponding to the web application recorded in the historical record at the last time;
wherein the structure difference value comprises: difference values of the levels of the structure tree or difference of the number of child nodes of the structure tree; the tag difference values include: any one or more of a difference value of the number of tags, a difference value of the ID of the tag and a difference value of the class of the tag;
and determining the difference value between the structure tree of the HTML label and the structure tree of the HTML label corresponding to the web application recorded in the history record at the latest time according to the structure difference value, the preset structure difference value weight and the label difference value and the preset label difference value weight.
Further, the processor 501 is configured to perform the following steps:
according to a preset similarity algorithm, determining the similarity between the content information and the content information obtained by scanning the web application in the historical record for the last time;
judging whether the similarity is greater than a preset threshold value or not;
if the similarity is judged to be larger than the preset threshold value, determining that the web application is not changed;
and if the similarity is judged to be less than or equal to the preset threshold value, determining that the web application is changed.
Further, the content information includes text information and path information; the processor 501 is configured to perform the following steps:
determining the text similarity between the text information and the text information obtained by scanning the web application for the last time in the history record according to a preset similarity algorithm;
determining the path similarity between the path information and the path information obtained by scanning the web application for the last time in the history record according to a preset similarity algorithm;
and determining the similarity of the content information according to the text similarity and the path similarity.
Further, the processor 501 is configured to perform the following steps:
and determining the similarity of the content information according to the text similarity, a preset text similarity weight and the path similarity and a preset path similarity weight.
In the embodiment of the invention, the terminal can analyze the acquired web application to obtain the web component information of the web application, if the web component information is judged to be changed, the web application is determined to be changed, if the web component information is judged not to be changed, the content information of the page corresponding to the web application is acquired, and when the web application is determined to be changed, a preset vulnerability scanning engine is called to carry out vulnerability detection on the changed web application, so that vulnerability detection on all the acquired web applications is avoided, and vulnerability detection efficiency is improved.
It should be understood that, in the embodiment of the present invention, the Processor 501 may be a Central Processing Unit (CPU), and the Processor may also be other general-purpose processors, Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) or other Programmable logic devices, discrete Gate or transistor logic devices, discrete hardware components, and the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The input device 502 may include a touch pad, a fingerprint sensor (for collecting fingerprint information of a user and direction information of the fingerprint), a microphone, etc., and the output device 503 may include a display (LCD, etc.), a speaker, etc.
The memory 504 may include a read-only memory and a random access memory, and provides instructions and data to the processor 501. A portion of the memory 504 may also include non-volatile random access memory. For example, the memory 504 may also store device type information.
In a specific implementation, the processor 501, the input device 502, and the output device 503 described in this embodiment of the present invention may execute the implementation described in the method embodiment shown in fig. 1, fig. 2, or fig. 3 of the vulnerability detection method for a web application provided in this embodiment of the present invention, or may execute the implementation of the terminal described in fig. 4 in this embodiment of the present invention, which is not described herein again.
The embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the method for detecting a vulnerability of a web application described in the embodiment corresponding to fig. 1, fig. 2, or fig. 3 is implemented, and a terminal in the embodiment corresponding to fig. 4 or fig. 5 of the present invention may also be implemented, which is not described herein again.
The computer readable storage medium may be an internal storage unit of the terminal according to any of the foregoing embodiments, for example, a hard disk or a memory of the terminal. The computer readable storage medium may also be an external storage device of the terminal, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like provided on the terminal. Further, the computer-readable storage medium may also include both an internal storage unit and an external storage device of the terminal. The computer-readable storage medium is used for storing the computer program and other programs and data required by the terminal. The computer readable storage medium may also be used to temporarily store data that has been output or is to be output.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein may be implemented in electronic hardware, computer software, or a combination of both. To clearly illustrate the interchangeability of hardware and software, the components and steps of the examples have been described above generally in terms of their functions. Whether such functions are implemented as hardware or software depends upon the particular application and the design constraints imposed on the implementation. Skilled artisans may implement the described functions in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention in essence, or the part of it that contributes over the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a terminal, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and other media capable of storing program code.
The above description is only a part of the embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive various equivalent modifications or substitutions within the technical scope of the present invention, and these modifications or substitutions should be covered within the scope of the present invention.

Claims (8)

1. A vulnerability detection method of a web application is characterized by comprising the following steps:
analyzing the obtained web application to obtain web component information of the web application, wherein the web component information is entity information for encapsulating data and methods of the web application;
judging whether the web component information of the web application changes or not according to the web component information and historical web component information of the web application in a historical record, wherein the historical record stores the historical web component information of the web application;
if the web component information of the web application is judged to be changed, determining that the web application is changed;
if the web component information of the web application is judged not to be changed, acquiring content information of a page corresponding to the web application, and determining whether the web application is changed or not according to the content information, wherein the content information comprises text information and path information;
the determining whether the web application is changed according to the content information includes:
determining the text similarity between the text information and the text information obtained by scanning the web application for the last time in the history record according to a preset similarity algorithm;
determining the path similarity between the path information and the path information obtained by scanning the web application for the last time in the history record according to a preset similarity algorithm;
determining the similarity of the content information according to the text similarity and the path similarity;
judging whether the similarity is greater than a preset threshold value or not;
if the similarity is judged to be larger than the preset threshold value, determining that the web application is not changed;
if the similarity is judged to be smaller than or equal to the preset threshold value, determining that the web application changes;
and calling a preset vulnerability scanning engine to carry out vulnerability detection on the changed web application when the web application is determined to be changed.
2. The method of claim 1, wherein the content information includes information of a hypertext markup language (HTML) tag, and wherein the determining whether the web application is changed according to the content information includes:
acquiring information of the HTML tag corresponding to the web application;
judging whether an HTML page corresponding to the HTML tag changes or not according to the information of the HTML tag;
and if the HTML page is judged to be changed, determining that the web application is changed.
3. The method of claim 2, wherein the information of the HTML tag includes a plurality of HTML tags; the judging whether the HTML page corresponding to the HTML tag changes or not according to the information of the HTML tag comprises the following steps:
generating a structure tree of the HTML labels according to the HTML labels;
determining a difference value between the structural tree of the HTML tag and the structural tree of the HTML tag corresponding to the web application recorded in the historical record at the last time;
and if the difference value is larger than a preset threshold value, determining that the HTML page changes.
4. The method of claim 3, wherein determining a difference value between the structure tree of the HTML tag and the structure tree of the HTML tag corresponding to the web application recorded most recently in the history comprises:
acquiring a structure difference value and a tag difference value of the structure tree of the HTML tag and the structure tree of the HTML tag corresponding to the web application recorded in the historical record at the last time;
wherein the structure difference value comprises: difference values of the levels of the structure tree or difference of the number of child nodes of the structure tree; the tag difference values include: any one or more of a difference value of the number of tags, a difference value of the ID of the tag and a difference value of the class of the tag;
and determining the difference value between the structure tree of the HTML label and the structure tree of the HTML label corresponding to the web application recorded in the history record at the latest time according to the structure difference value, the preset structure difference value weight and the label difference value and the preset label difference value weight.
5. The method according to claim 1, wherein the determining the similarity of the content information according to the text similarity and the path similarity comprises:
and determining the similarity of the content information according to the text similarity, a preset text similarity weight and the path similarity and a preset path similarity weight.
6. A terminal, characterized in that it comprises means for performing the method according to any of claims 1-5.
7. A terminal, comprising a processor, an input device, an output device, and a memory, the processor, the input device, the output device, and the memory being interconnected, wherein the memory is configured to store a computer program comprising program instructions, the processor being configured to invoke the program instructions to perform the method of any of claims 1-5.
8. A computer-readable storage medium, characterized in that the computer storage medium stores a computer program comprising program instructions that, when executed by a processor, cause the processor to perform the method according to any of claims 1-5.
CN201810854861.4A 2018-07-27 2018-07-27 Vulnerability detection method of web application, terminal and computer readable medium Active CN109167757B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201810854861.4A CN109167757B (en) 2018-07-27 2018-07-27 Vulnerability detection method of web application, terminal and computer readable medium
PCT/CN2018/108673 WO2020019511A1 (en) 2018-07-27 2018-09-29 Web application vulnerability detection method, terminal, and computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810854861.4A CN109167757B (en) 2018-07-27 2018-07-27 Vulnerability detection method of web application, terminal and computer readable medium

Publications (2)

Publication Number Publication Date
CN109167757A CN109167757A (en) 2019-01-08
CN109167757B CN109167757B (en) 2021-05-11

Family

ID=64898689

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810854861.4A Active CN109167757B (en) 2018-07-27 2018-07-27 Vulnerability detection method of web application, terminal and computer readable medium

Country Status (2)

Country Link
CN (1) CN109167757B (en)
WO (1) WO2020019511A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102457500A (en) * 2010-10-22 2012-05-16 北京神州绿盟信息安全科技股份有限公司 Website scanning equipment and method
CN103095681A (en) * 2012-12-03 2013-05-08 微梦创科网络科技(中国)有限公司 Loophole detection method and device
CN106209487A (en) * 2015-05-07 2016-12-07 阿里巴巴集团控股有限公司 For detecting the method and device of the security breaches of webpage in website
CN106951784A (en) * 2017-02-23 2017-07-14 南京航空航天大学 A kind of Web application conversed analysis methods towards XSS Hole Detections
CN107846413A (en) * 2017-11-29 2018-03-27 济南浪潮高新科技投资发展有限公司 A kind of method and system for defending cross-site scripting attack

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102404281B (en) * 2010-09-09 2014-08-13 北京神州绿盟信息安全科技股份有限公司 Website scanning device and method
JP5522850B2 (en) * 2010-11-10 2014-06-18 京セラコミュニケーションシステム株式会社 Vulnerability diagnostic device
CN108063759B (en) * 2017-12-05 2022-08-16 西安交大捷普网络科技有限公司 Web vulnerability scanning method

Also Published As

Publication number Publication date
CN109167757A (en) 2019-01-08
WO2020019511A1 (en) 2020-01-30

Similar Documents

Publication Publication Date Title
CN110275958B (en) Website information identification method and device and electronic equipment
CN103095681B (en) A kind of method and device detecting leak
US9531751B2 (en) System and method for identifying phishing website
CN109243619B (en) Generation method and device of prediction model and computer readable storage medium
US20150324478A1 (en) Detection method and scanning engine of web pages
US20130042306A1 (en) Determining machine behavior
CN108898009A (en) A kind of anti-crawler method, terminal and computer-readable medium
CN113542442B (en) Malicious domain name detection method, device, equipment and storage medium
CN111415683A (en) Method and device for alarming abnormality in voice recognition, computer equipment and storage medium
CN111552635A (en) Data detection method, equipment, server and readable storage medium
CN108509228B (en) Page loading method, terminal equipment and computer readable storage medium
CN110704721A (en) Client data processing method and device, terminal equipment and readable storage medium
CN109167757B (en) Vulnerability detection method of web application, terminal and computer readable medium
CN110457900B (en) Website monitoring method, device and equipment and readable storage medium
CN109067738B (en) Port vulnerability detection method, terminal and computer readable medium
CN113094283A (en) Data acquisition method, device, equipment and storage medium
CN111125704A (en) Webpage Trojan horse recognition method and system
CN111782991A (en) Method, device, equipment and storage medium for detecting abnormal hidden link of website
CN111460448A (en) Malicious software family detection method and device
CN116069324A (en) Dynamic form construction method and device based on Vue
CN115658646A (en) Binary characteristic database construction method and device
CN113217826B (en) Pipeline water supply pipe network leakage alarm control method, device and medium
CN106790271A (en) A kind of detection method of sensitive data, device, computer-readable recording medium and storage control
WO2016134637A1 (en) Bar code recognition method and device
CN109067726B (en) Identification method and device for station building system, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant