CN110334301B - Page restoration method and device - Google Patents

Info

Publication number
CN110334301B
Authority
CN
China
Prior art keywords
page
hijacking
link
link hijacking
terminal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810234123.XA
Other languages
Chinese (zh)
Other versions
CN110334301A (en
Inventor
夏雄风
张波
张小龙
胡育辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Tencent Computer Systems Co Ltd
Original Assignee
Shenzhen Tencent Computer Systems Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Tencent Computer Systems Co Ltd
Priority to CN201810234123.XA
Publication of CN110334301A
Application granted
Publication of CN110334301B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/955 Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
    • G06F16/9566 URL specific, e.g. using aliases, detecting broken or misspelled links
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The embodiment of the invention provides a page restoration method and device, which relate to the technical field of information security. The method comprises: receiving page data of a link hijacking page sent by a terminal, wherein the page data is sent by the terminal when link hijacking occurs in the page; restoring the link hijacking page according to the page data; and determining, according to the restored link hijacking page, the position where the link hijacking occurs in the page. Because the page data of the link hijacking page is obtained from the terminal and the link hijacking site is restored from that page data, screenshots and evidence collection no longer need to be carried out manually, which improves the efficiency of collecting evidence of the link hijacking site. Furthermore, the specific position of the link hijacking in the page can be determined from the restored link hijacking page, so that the link hijacking in the page can be analyzed intuitively and its risk assessed.

Description

Page restoration method and device
Technical Field
The embodiment of the invention relates to the technical field of information security, in particular to a page restoration method and device.
Background
Link hijacking means that an operator (or a hacker) inserts malicious code or uniform resource locators (Uniform Resource Locator, URL for short) into web pages on the physical network transmission link, in order to insert advertisements, replace the original advertisements, or steal user information. The content inserted by the operator (or hacker) is completely uncontrollable and harms both the user experience and the enterprise's image. In addition, when normal advertisements are replaced, their exposure decreases.
The site where link hijacking occurs is often visible only to the affected user, which makes evidence collection very difficult. In the prior art, when a user finds that link hijacking has occurred on an accessed page, the user saves a screenshot of the hijacked page and contacts the service party; the service party then remotely captures the traffic during the hijacking with tools such as Wireshark and analyzes where on the link the hijacking occurred. This method depends too heavily on the user to collect evidence of the link hijacking site, and its evidence collection efficiency is low.
Disclosure of Invention
The embodiment of the invention provides a page restoration method and device.
In one aspect, an embodiment of the present invention provides a page restoration method, which includes: receiving page data of a link hijacking page sent by a terminal, wherein the page data of the link hijacking page is sent by the terminal when it determines that link hijacking occurs in the page; restoring the link hijacking page according to the page data of the link hijacking page; and determining the position of the link hijacking in the link hijacking page according to the restored link hijacking page.
In one possible design, before the link hijacking page is restored according to the page data of the link hijacking page, the page data of the repeated link hijacking page in the category to which the link hijacking page belongs can be filtered first, so that the repeated link hijacking page is prevented from being restored, and the efficiency of restoring the link hijacking page is improved.
In one possible design, the category of the link hijacking page may be determined according to the attribute information of the link hijacking page, or may be determined according to the attribute information of the link hijacking page and hijacking information contained in the link hijacking page.
In one possible design, the escape character in the page data of the link hijacking page may be adjusted first, the storage path corresponding to the page data of the link hijacking page may be modified, and then the link hijacking page may be rendered according to the adjusted escape character and the page data of the link hijacking page after the storage path is modified.
On the other hand, the embodiment of the invention provides a page restoration device, which comprises a receiving module, a restoration module and a processing module.
The receiving module is used for receiving the page data of the link hijacking page sent by the terminal, wherein the page data of the link hijacking page is sent by the terminal when the link hijacking of the page is determined;
The restoring module is used for restoring the link hijacking page according to the page data of the link hijacking page;
and the processing module is used for determining the position of the link hijacking in the link hijacking page according to the restored link hijacking page.
In one possible design, the processing module is further configured to filter the page data of the link hijacking page repeated in the category to which the link hijacking page belongs before the link hijacking page is restored according to the page data of the link hijacking page.
In one possible design, the category of the link hijacking page may be determined according to the attribute information of the link hijacking page, or may be determined according to the attribute information of the link hijacking page and hijacking information contained in the link hijacking page.
In one possible design, the processing module is specifically configured to adjust escape characters in the page data of the link hijacking page and modify the storage path corresponding to the page data of the link hijacking page, and then to render the link hijacking page according to the page data of the link hijacking page after the escape characters are adjusted and the storage path is modified.
In another aspect, an embodiment of the present invention provides a terminal device, including at least one processing unit and at least one storage unit, where the storage unit stores a computer program, and when the program is executed by the processing unit, causes the processing unit to execute the steps of the method described in the foregoing aspect.
In yet another aspect, an embodiment of the present invention provides a computer readable storage medium storing a computer program executable by a terminal device, which when run on the terminal device causes the terminal device to perform the steps of the method of the above aspect.
In the embodiment of the invention, when the terminal detects that a page is link hijacked, the terminal acquires the page data of the link hijacking page; the link hijacking page is then restored according to that page data, and the position of the link hijacking is determined from the restored link hijacking page. Because the terminal itself detects whether the page is link hijacked and feeds back the page data of the link hijacking page when hijacking occurs, link hijacking is discovered promptly without manual judgment by the user, which improves the efficiency of detecting link hijacking. In addition, the page data of the link hijacking page is obtained from the terminal and the link hijacking site is restored from that page data, so screenshots and evidence collection no longer need to be carried out manually, which improves the efficiency of collecting evidence of the link hijacking site. Finally, the position of the link hijacking in the page can be determined from the restored link hijacking page, so that the link hijacking in the page can be analyzed intuitively and its risk assessed.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and a person skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is a diagram of a system architecture according to an embodiment of the present invention;
Fig. 2 is a schematic flow chart of a page restoration method according to an embodiment of the present invention;
Fig. 3a is a schematic diagram of a link hijacking page according to an embodiment of the present invention;
Fig. 3b is a schematic diagram of a link hijacking page according to an embodiment of the present invention;
Fig. 4 is a flow chart of a link hijacking page display method according to an embodiment of the present invention;
Fig. 5 is a schematic flow chart of a page restoration method according to an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of a page restoration device according to an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of a terminal device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantageous effects of the present invention more apparent, the present invention will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Fig. 1 schematically illustrates a system architecture to which an embodiment of the present invention is applied, and as shown in fig. 1, the system architecture to which an embodiment of the present invention is applied includes at least one terminal 110, at least one service side server 120, and an analysis server 130.
The terminal 110 is an electronic device with web browsing capability, such as a smartphone, a tablet computer, or a portable personal computer.
After receiving the page request sent by the terminal 110, the service side server 120 returns page data corresponding to the page request to the terminal 110. The terminal 110 renders and displays a page according to the received page data. The terminal 110 is connected to the server 120 through a wired or wireless network.
The analysis server 130 is a page restoration device, receives the page data of the link hijacking page sent by the terminal 110, and restores the link hijacking site according to the page data of the link hijacking page. The analysis server 130 is a server, a server cluster formed by a plurality of servers, or a cloud computing center. The analysis server 130 is connected to the terminal 110 through a wired or wireless network.
In the embodiment of the present invention, while the terminal 110 interacts with the service side server 120, the terminal 110 determines the link hijacking state of the page with which the service side server 120 responds. In one possible implementation, a browser application is installed on the terminal 110. After the user enters a website address in the browser application and submits it, the terminal 110 sends a page request through the browser application to the service side server 120 corresponding to the website. The service side server 120 returns the page data corresponding to the page request to the terminal 110, together with a link hijacking detection js and a page URL whitelist, where the page URL whitelist is the list of URLs contained in the normal page corresponding to the website. The link hijacking detection js is installed in advance on the service side server 120 corresponding to each site whose link hijacking state needs to be monitored, and the service side server 120 sends the link hijacking detection js together with the page data only when the terminal 110 requests page data from the service side server 120. After acquiring the page data, the terminal 110 renders and displays the page according to the page data. At the same time, the terminal 110 runs the link hijacking detection js, which scans the displayed page to obtain the URLs in the currently displayed page. The scanned URLs are then compared with the page URL whitelist. When a scanned URL does not match the page URL whitelist, it is determined that link hijacking has occurred on the page with which the service side server 120 responds.
Optionally, in order to restore the link hijacking page, when detecting that link hijacking occurs on the page with which the service side server 120 responds, the terminal 110 sends the page data of the link hijacking page to the analysis server 130. The analysis server 130 restores the link hijacking page according to the page data of the link hijacking page, and then determines the position where the link hijacking occurs in the link hijacking page according to the restored link hijacking page. Further, the analysis server 130 may analyze the link hijacking in the page and assess its risk according to the restored link hijacking page, and then send the assessment result and a screenshot of the restored link hijacking page to the staff of the service party, so that the staff of the service party can handle the link hijacking in the page.
Optionally, the wireless network or wired network described above uses standard communication technologies and/or protocols. The network is typically the Internet, but may be any network, including but not limited to a local area network (Local Area Network, LAN), a metropolitan area network (Metropolitan Area Network, MAN), a wide area network (Wide Area Network, WAN), a mobile, wired or wireless network, a private network, a virtual private network, or any combination thereof.
Fig. 2 shows a flowchart of a page restoration method provided by an embodiment of the present invention. Taking the application of the method to the system architecture shown in Fig. 1 as an example, the method includes:
Step S201, the analysis server receives the page data of the link hijacking page sent by the terminal, wherein the page data of the link hijacking page is sent by the terminal when it determines that link hijacking occurs in the page.
In one possible implementation, the terminal sends a web page request to the service side server. The web page request may be a GET or POST request of the HTTP protocol, where GET indicates acquiring data from the service side server and POST indicates sending data to the service side server. For sites whose link hijacking state needs to be detected, the link hijacking detection js is set in the corresponding service side server in advance, and the page URL whitelist of each such site is stored in advance as well. After receiving the web page request sent by the terminal, the service side server returns web page information carrying the link hijacking detection js and the page URL whitelist to the terminal. After receiving the web page information, the terminal renders and displays the web page according to it and, at the same time, runs the link hijacking detection js to scan the displayed web page and acquire the URLs in the web page. Optionally, URLs in the web page take the following forms: text-type URLs (mainly in static pages) and URLs wrapped by a js algorithm (mainly in dynamic pages). The URLs in the web page are compared with the page URL whitelist; when a URL in the web page does not match the page URL whitelist, the web page is determined to be a link hijacking page, and the page data of the link hijacking page is sent to the analysis server.
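The whitelist comparison described above can be sketched as follows. This is a minimal Python illustration only, not the detection script of the embodiment (which runs as js in the browser); the function names `host_of` and `find_unlisted_urls`, the choice to match at host level, and the sample URLs other than news.qq.com and www.123.com are assumptions.

```python
from urllib.parse import urlparse


def host_of(url: str) -> str:
    """Return the lowercased host of a URL, tolerating scheme-less input."""
    parsed = urlparse(url if "//" in url else "//" + url)
    return (parsed.hostname or "").lower()


def find_unlisted_urls(page_urls, whitelist):
    """Return the scanned URLs whose host is absent from the page URL whitelist."""
    allowed = {host_of(u) for u in whitelist}
    return [u for u in page_urls if host_of(u) not in allowed]


# Hypothetical whitelist and scan result for one rendered page.
whitelist = ["news.qq.com", "https://img.qq.com/"]
scanned = ["https://news.qq.com/a.html", "http://www.123.com/ad.png"]

unlisted = find_unlisted_urls(scanned, whitelist)
is_hijacked = bool(unlisted)  # any unmatched URL marks the page as link hijacked
```

Matching on the host rather than the full URL is a design choice of this sketch; the text only states that scanned URLs are compared with the whitelist.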
Optionally, the page data of the link hijacking page at least includes page display data of the link hijacking page, attribute information of the link hijacking page, and hijacking information contained in the link hijacking page. Specifically, the attribute information of the link hijacking page may be a URL of the link hijacking page, a user account in the link hijacking page, and the like. The hijacking information contained in the link hijacking page may be malicious URLs, malicious URL types, etc. in the link hijacking page, wherein the malicious URL types include IMAGE, SCRIPT, IFRAME, EMBED, FRAMES, UNDEFINED, etc. The following describes, in a specific embodiment, attribute information of a link hijacking page and hijacking information included in the link hijacking page:
In one possible implementation, suppose the terminal determines through the link hijacking detection js that link hijacking occurs on the Tencent news page and, by comparing the URLs in the Tencent news page with the page URL whitelist, determines that the malicious URL in the Tencent news page is www.123.com and that the malicious URL type is IMAGE. The attribute information of the Tencent news page and the hijacking information contained in it are shown in Table 1:
TABLE 1
In another possible implementation, suppose the terminal determines through the link hijacking detection js that link hijacking occurs on the Tencent game page, and the user account contained in the Tencent game page is 12345678. By comparing the URLs in the Tencent game page with the page URL whitelist, the terminal determines that the malicious URL in the Tencent game page is www.ABC.com and that the malicious URL type is SCRIPT. The attribute information of the Tencent game page and the hijacking information contained in it are shown in Table 2:
TABLE 2
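As an illustration of how the page data in the two examples above might be held in memory, the following Python sketch defines one record per reported page. The class name `HijackReport` and all field names are hypothetical; the game page URL is left as a placeholder because the text does not give it, and using news.qq.com for the news page is an assumption based on the URL used later in the description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class HijackReport:
    """Page data reported by a terminal (field names are hypothetical)."""
    page_url: str                # attribute information: URL of the hijacked page
    user_account: Optional[str]  # attribute information: may be absent
    bad_url: str                 # hijacking information: malicious URL
    bad_type: str                # hijacking information: IMAGE, SCRIPT, ...
    display_data: str = ""       # page display data, for example the HTML source


# News example from the text; the page URL is assumed to be news.qq.com.
news_report = HijackReport("news.qq.com", None, "www.123.com", "IMAGE")

# Game example from the text; the page URL is a placeholder.
game_report = HijackReport("<game page URL>", "12345678", "www.ABC.com", "SCRIPT")
```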
Step S202, the analysis server restores the link hijacking page according to the page data of the link hijacking page.
Since multiple terminals in the same area may access the same page, when link hijacking occurs in that page, every terminal accessing the page sends the page data of the page to the analysis server. If the analysis server restored and analyzed the page data received from every terminal, the same link hijacking page would be restored and analyzed repeatedly, wasting resources and reducing efficiency. To avoid this, in the embodiment of the invention, before the link hijacking page is restored according to its page data, the page data of repeated link hijacking pages in the category to which the link hijacking page belongs is filtered out. Repeated link hijacking pages may be link hijacking pages with identical attribute information, link hijacking pages with identical hijacking information, or link hijacking pages with both identical attribute information and identical hijacking information. For example, terminal A, terminal B, terminal C, and terminal D all send page data of a link hijacking page to the analysis server within a set period. If the attribute information and the hijacking information of the link hijacking pages sent by the four terminals are the same, the page data of only one link hijacking page is kept out of the four received. Because the page data of repeated link hijacking pages in the category to which the link hijacking page belongs is filtered out before the link hijacking page is restored, the analysis server does not repeatedly restore the same link hijacking site, which improves the efficiency of link hijacking evidence collection and avoids wasting resources.
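The four-terminal example can be sketched in Python as below. The dictionary keys and the function name `dedupe_reports` are hypothetical, and keeping the first report for each key is one possible retention rule chosen for this sketch.

```python
def dedupe_reports(reports):
    """Keep the first report for each (attribute info, hijacking info) key."""
    kept = {}
    for report in reports:
        key = (report["page_url"], report["user_account"],
               report["bad_url"], report["bad_type"])
        kept.setdefault(key, report)  # later duplicates are dropped
    return list(kept.values())


# Terminals A to D report the same hijacked page within one period.
base = {"page_url": "news.qq.com", "user_account": None,
        "bad_url": "tt.ab.com", "bad_type": "IMAGE"}
reports = [dict(base, terminal=name) for name in "ABCD"]

unique = dedupe_reports(reports)
```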
In one possible implementation, the category of the link hijacking page may be determined according to the attribute information of the link hijacking page. Specifically, the category may be determined according to the URL of the link hijacking page, according to the user account in the link hijacking page, or according to both. The following takes determining the category according to the URL of the link hijacking page as an example. Suppose the sites to be monitored include news, games, videos, and finance. With the page URL as the category attribute, link hijacking pages can be classified into four categories, namely news, games, videos, and finance, as shown in Table 3:
TABLE 3
When the page data of a link hijacking page sent by the terminal is received and the URL of the link hijacking page is determined to be news.qq.com, the category of the link hijacking page can be determined to be news according to Table 3.
In one possible implementation, the category of the link hijacking page may be determined according to the hijacking information contained in the link hijacking page. Specifically, the category may be determined according to the malicious URL in the link hijacking page, according to the malicious URL type, or according to both. The following takes determining the category according to the malicious URL type as an example. Suppose the common malicious URL types include IMAGE, SCRIPT, IFRAME, EMBED, FRAMES, and UNDEFINED. With the malicious URL type as the category attribute, link hijacking pages are classified into categories 0, 1, 2, 3, 4, and 5, as shown in Table 4:
TABLE 4
When the page data of a link hijacking page sent by the terminal is received and the malicious URL type contained in the link hijacking page is determined to be IMAGE, the category of the link hijacking page can be determined to be 0 according to Table 4.
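The mapping from malicious URL type to category number can be written down as a small enumeration. The Python sketch below uses the six types and the codes 0 to 5 given in the text; the names `BadType` and `category_by_type`, and the fallback of unknown type names to UNDEFINED, are assumptions.

```python
from enum import IntEnum


class BadType(IntEnum):
    """Malicious URL types and their category codes."""
    IMAGE = 0
    SCRIPT = 1
    IFRAME = 2
    EMBED = 3
    FRAMES = 4
    UNDEFINED = 5


def category_by_type(type_name: str) -> int:
    """Map a reported type name to its category; unknown names fall back to UNDEFINED."""
    return int(BadType.__members__.get(type_name.upper(), BadType.UNDEFINED))
```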
In another possible implementation, the category of the link hijacking page may be determined according to both the attribute information of the link hijacking page and the hijacking information contained in it. Specifically, the category may be determined according to: the link hijacking page URL and the malicious URL; the link hijacking page URL and the malicious URL type; the user account and the malicious URL; the link hijacking page URL, the user account, and the malicious URL type; the link hijacking page URL, the malicious URL, and the malicious URL type; or the link hijacking page URL, the user account, the malicious URL, and the malicious URL type. The following takes the link hijacking page URL and the malicious URL type as an example. Suppose the sites to be monitored include Tencent news and Tencent games, and the common malicious URL types include IMAGE, SCRIPT, IFRAME, EMBED, FRAMES, and UNDEFINED. Combining the link hijacking page URL with the malicious URL type then classifies link hijacking pages into 12 categories, as shown in Table 5:
TABLE 5
When the page data of the link hijacking page sent by the terminal is received, the link hijacking page URL is judged to be news.qq.com, the type of malicious URL contained in the link hijacking page is UNDEFINED, and then the category of the link hijacking page can be determined to be 6 according to the table 5. Because the page data of the link hijacking page is classified, repeated page data in the same class is filtered, and then the link hijacking page is restored, the efficiency of restoring the link hijacking page is improved. In addition, after the link hijacking pages of all the categories are restored, corresponding analysis can be carried out on each type of link hijacking page, so that the accuracy of risk assessment on the link hijacking is improved.
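One way to number the 12 combined categories is sketched below in Python. The 1-based numbering with the news site first is an assumption, chosen only because it agrees with the example above (news.qq.com with UNDEFINED gives category 6); "games.example" is a placeholder, since the text does not give the URL of the game site.

```python
BAD_TYPES = ["IMAGE", "SCRIPT", "IFRAME", "EMBED", "FRAMES", "UNDEFINED"]
# "games.example" is a placeholder; the text does not give the game site's URL.
SITES = ["news.qq.com", "games.example"]


def combined_category(page_url: str, bad_type: str) -> int:
    """Return a 1-based category index over (site, malicious URL type) pairs."""
    return SITES.index(page_url) * len(BAD_TYPES) + BAD_TYPES.index(bad_type) + 1
```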
In specific implementation, the analysis server may configure virtual filters corresponding to each category of the link hijacking page in the management configuration file, and filter page data of the link hijacking page of each category through the virtual filters. For example, when the link hijacking page is classified into 12 categories according to the link hijacking page URL and the malicious URL type, 12 virtual filters can be correspondingly configured in the analysis server. The configuration code of the virtual filter is as follows:
[FILTERS]
# Number of virtual filters to apply
filter_cnt=12
# Malicious URL type definition
#IMAGE=0;
#SCRIPT=1;
#IFRAME=2;
#EMBED=3;
#FRAMES=4;
#UNDEFINED=5;
# Configuration information of virtual filter 1
# Link hijacking page URL is set to news.qq.com
domain_0=news.qq.com
# User account in the link hijacking page is set to null
uin_0=
# Malicious URL type is set to IMAGE
bad_type_0=0
# Malicious URL in the link hijacking page is set to null
bad_url_0=
The other 11 virtual filters are configured in the same way as virtual filter 1, and details are not repeated here. After receiving the page data of a link hijacking page sent by the terminal, the analysis server inputs the received page data into the 12 configured virtual filters. The virtual filter corresponding to the category to which the link hijacking page belongs outputs a page data file for the link hijacking page. The page data file may be named in the format: link hijacking page URL_malicious URL type_malicious URL_random number. For example, the analysis server obtains from the terminal the page data of a news page in which the malicious URL is tt.ab.com and the malicious URL type is IMAGE. The page data of the news page is input to the 12 virtual filters; virtual filter 1 outputs the page data file of the news page, named news.qq.com_IMAGE_tt.ab.com_1.html, and the other 11 virtual filters output no page data file for the news page.
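The routing of one report through a configured virtual filter and the naming of its output file can be sketched in Python as follows. This is an illustration under stated assumptions, not the implementation of the embodiment: only virtual filter 1 is configured, the key spelling `bad_url_0` is assumed, an empty configuration value is treated as "match anything", and a caller-supplied serial number stands in for the random number in the file name.

```python
import configparser

CONFIG_TEXT = """
[FILTERS]
filter_cnt=1
domain_0=news.qq.com
uin_0=
bad_type_0=0
bad_url_0=
"""

TYPE_NAMES = ["IMAGE", "SCRIPT", "IFRAME", "EMBED", "FRAMES", "UNDEFINED"]


def load_filters(text):
    """Read each virtual filter's fields; an empty value means "match anything"."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["FILTERS"]
    return [
        {field: section.get(f"{field}_{i}", "")
         for field in ("domain", "uin", "bad_type", "bad_url")}
        for i in range(section.getint("filter_cnt"))
    ]


def matches(flt, report):
    """A report passes a filter when every non-empty filter field equals the report's."""
    return all(not want or want == str(report[field]) for field, want in flt.items())


def output_name(report, serial):
    """Build 'page URL_type_malicious URL_number.html' for a matched report."""
    type_name = TYPE_NAMES[report["bad_type"]]
    return f"{report['domain']}_{type_name}_{report['bad_url']}_{serial}.html"


filters = load_filters(CONFIG_TEXT)
report = {"domain": "news.qq.com", "uin": "", "bad_type": 0, "bad_url": "tt.ab.com"}
names = [output_name(report, 1) for flt in filters if matches(flt, report)]
```

With the full configuration, the same report would be offered to all 12 filters and only the matching one would produce a file.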
Further, when the page data of repeated link hijacking pages in the category to which the link hijacking page belongs is filtered before restoration, the filtering may be performed according to the file names of the page data files. For example, within a set period, virtual filter 1 outputs the page data files of 5 link hijacking pages, named news.qq.com_IMAGE_tt.ab.com_1.html, news.qq.com_IMAGE_tt.ab.com_2.html, news.qq.com_IMAGE_tt.ab.com_3.html, news.qq.com_IMAGE_12.ab.com_1.html, and news.qq.com_IMAGE_12.ab.com_2.html. The repeated page data files can be filtered by using the link hijacking page URL, the malicious URL type, and the malicious URL as the key, leaving two page data files after filtering: news.qq.com_IMAGE_tt.ab.com_1.html and news.qq.com_IMAGE_12.ab.com_1.html. It should be noted that, when the page data files are filtered, the file to keep may be designated among the duplicates according to the actual situation, or one of the duplicates may be kept at random. Optionally, after the page data of link hijacking pages sent by terminals is received, the page data of each link hijacking page may be input into the virtual filters corresponding to each category as it is received, or the page data of the link hijacking pages received over a period of time may be input into the virtual filters corresponding to each category together.
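The file-name-based filtering in this example can be sketched in Python as follows. The function name `dedupe_by_name` is hypothetical, and keeping the first file seen for each key is just one of the retention options the text allows.

```python
def dedupe_by_name(file_names):
    """Keep one file per (page URL, malicious URL type, malicious URL) key."""
    kept = {}
    for name in file_names:
        key = name.rsplit("_", 1)[0]  # drop the trailing "_<number>.html"
        kept.setdefault(key, name)    # keeps the first file seen for each key
    return list(kept.values())


outputs = [
    "news.qq.com_IMAGE_tt.ab.com_1.html",
    "news.qq.com_IMAGE_tt.ab.com_2.html",
    "news.qq.com_IMAGE_tt.ab.com_3.html",
    "news.qq.com_IMAGE_12.ab.com_1.html",
    "news.qq.com_IMAGE_12.ab.com_2.html",
]
kept_files = dedupe_by_name(outputs)
```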
Step S203, the analysis server determines the position of the link hijacking in the link hijacking page according to the restored link hijacking page.
Specifically, the link hijacking page restored by the analysis server is the page that was displayed on the terminal when the link hijacking occurred. For example, as shown in Fig. 3a, if the picture corresponding to news 2 in the restored page is inconsistent with the content of news 2, the position where link hijacking occurs in the link hijacking page shown in Fig. 3a is the position of the news 2 picture, that is, the hijacking advertisement picture in Fig. 3a. For another example, as shown in Fig. 3b, if illegal or harmful information appears in picture 1, framed in the upper right corner of the restored news page, the position where the news page is link hijacked is picture 1.
Further, risk assessment may be performed on the link hijacking that occurred on the page according to the hijacked content. For example, the risk level of link hijacking is divided in advance into three levels: high, medium, and low. When the content of the hijacking advertisement picture shown in Fig. 3a is legal and user account information cannot be stolen, the risk level of the hijacking advertisement picture is determined to be low; when the content is illegal and user account information cannot be stolen, the risk level is determined to be medium; when the content is illegal and user account information is stolen, the risk level is determined to be high. For a link hijacking page whose risk level is high, the screenshot of the link hijacking page and the risk assessment result may be pushed immediately to the staff of the service party through communication tools such as e-mail or WeChat. For a link hijacking page whose risk level is medium or low, the screenshot and the risk assessment result may be pushed to the staff of the service party through such communication tools at regular intervals, or when the number of times link hijacking occurs on the page reaches a preset threshold, so that the staff of the service party can handle the link hijacking on the page.
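The three-level grading and the push rule can be sketched in Python as below. The function names are hypothetical. The text does not say how legal content that nevertheless steals account information should be graded; treating it as high risk is an assumption of this sketch. The periodic push for medium and low risk is not modeled here.

```python
def risk_level(content_is_legal: bool, steals_account_info: bool) -> str:
    """Grade one hijack using the three levels described in the text."""
    if steals_account_info:
        # The text only specifies illegal content plus theft as "high";
        # grading legal content plus theft as "high" too is an assumption.
        return "high"
    return "low" if content_is_legal else "medium"


def should_push_now(level: str, hijack_count: int, threshold: int) -> bool:
    """High risk is pushed immediately; other levels once the count reaches a threshold."""
    return level == "high" or hijack_count >= threshold
```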
Optionally, the link hijacking pages of each category may be analyzed to determine the positions at which link hijacking frequently occurs. The positions where link hijacking frequently occurs, together with screenshots of the link hijacking pages corresponding to each position, are then pushed to the staff of the service party through communication tools such as email or WeChat, so that the staff of the service party can deal with those positions.
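One possible way to find frequently hijacked positions is to count reports per category and position. The sketch below assumes each report has already been reduced to a `(category, position)` pair; the function name `frequent_positions` and the `min_count` cutoff are hypothetical and do not come from the description.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple


def frequent_positions(
    reports: Iterable[Tuple[str, str]], min_count: int
) -> Dict[str, List[Tuple[str, int]]]:
    """Return, per category, the positions reported at least min_count times."""
    counts = Counter(reports)  # counts each (category, position) pair
    result: Dict[str, List[Tuple[str, int]]] = defaultdict(list)
    for (category, position), n in counts.most_common():
        if n >= min_count:
            result[category].append((position, n))
    return dict(result)
```

For example, with reports of the "news 2 picture" position arriving three times and "picture 1" once for the same category, a `min_count` of 2 would keep only the former.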
First, when the terminal detects that link hijacking has occurred on the page returned by the service side server, the terminal sends the page data of the link hijacking page to the analysis server without manual intervention, so that the analysis server can collect evidence of the link-hijacked page in a timely manner without affecting the user's normal service, which improves the efficiency of responding to link hijacking. Second, the page as displayed on the terminal when the link hijacking occurred is restored according to the page data of the link hijacking page, so that the position at which the link hijacking occurred in the page can be determined intuitively. This facilitates an overall evaluation of the risk posed by the link hijacking and improves the accuracy of that evaluation.
Optionally, when the terminal sends the page data of the link hijacking page to the analysis server, the data may be forwarded through an intermediate server, and the page data (for example, its escape characters) may be modified to some extent during forwarding. In addition, after the analysis server receives the page data of the link hijacking page, the path at which the page data is stored is inconsistent with the path at which it was stored on the terminal, so the analysis server cannot render the page directly from the page data. To ensure that the analysis server can render and restore the link hijacking page, Fig. 4 exemplarily shows the flow of a link hijacking page display method provided by an embodiment of the present invention, which includes the following steps:
Step S401, adjusting escape characters in the page data of the link hijacking page, and modifying a storage path corresponding to the page data of the link hijacking page.
Specifically, the escape characters in the page data of the link hijacking page are adjusted according to the following rule: the page data of the link hijacking page is held in the variable res, and an adjustment statement of the form res=re.sub(r'a;', 'b', res) is used to replace the escape character 'a;' with the character 'b'. For example:
res = re.sub(r"&nbsp;", " ", res)   # adjust the escape character "&nbsp;" to a space " "
res = re.sub(r"&lt;", "<", res)     # adjust the escape character "&lt;" to "<"
res = re.sub(r"&gt;", ">", res)     # adjust the escape character "&gt;" to ">"
res = re.sub(r"&amp;", "&", res)    # adjust the escape character "&amp;" to "&"
res = re.sub(r"&quot;", "\"", res)  # adjust the escape character "&quot;" to "\""
In addition, when the storage path corresponding to the page data of the link hijacking page is modified, a path relative to the domain name may be replaced with an absolute path that includes the domain name. The same form of adjustment statement may be used to modify the storage path. For example, the relative path "/web2010/css/main.css" is replaced with the absolute path "http://jkyx.qq.com/web2010/css/main.css", that is, <link type="text/css" rel="stylesheet" href="/web2010/css/main.css"> is replaced with <link type="text/css" rel="stylesheet" href="http://jkyx.qq.com/web2010/css/main.css">. A corresponding adjustment statement, shown here for the src attribute, is:
# websitePrefix is a website prefix
res=re.sub(r"src=\"/","src=\""+websitePrefix,res)
Step S402, rendering the link hijacking page according to the page data of the link hijacking page after the escape characters have been adjusted and the storage path has been modified.
By adjusting the escape characters in the page data of the link hijacking page and modifying the storage path corresponding to that page data, the link hijacking page can be rendered and displayed in a local browser, so that evidence of the link hijacking scene is collected.
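Steps S401 and S402 can be combined into one helper, sketched below under stated assumptions. Unlike the sequential statements in the description, this sketch substitutes all escape characters in a single pass, so that a sequence such as "&amp;quot;" is decoded only once; it also rewrites both src and href attributes, whereas the description's statement covers only src. The names `adjust_page_data`, `_ENTITIES`, and `website_prefix` are illustrative.

```python
import re

# Escape characters named in the description and their replacements.
_ENTITIES = {"&nbsp;": " ", "&lt;": "<", "&gt;": ">", "&amp;": "&", "&quot;": '"'}
_ENTITY_RE = re.compile("|".join(re.escape(e) for e in _ENTITIES))

# Root-relative src/href values; the lookahead skips protocol-relative "//host" URLs.
_REL_PATH_RE = re.compile(r'\b(src|href)="/(?!/)')


def adjust_page_data(res: str, website_prefix: str) -> str:
    """Decode escape characters, then turn root-relative paths into absolute ones."""
    res = _ENTITY_RE.sub(lambda m: _ENTITIES[m.group(0)], res)
    prefix = website_prefix.rstrip("/") + "/"
    return _REL_PATH_RE.sub(lambda m: f'{m.group(1)}="{prefix}', res)
```

The adjusted string could then be written to a local file and opened in a browser for rendering, which is outside the scope of this sketch.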
Based on the system architecture diagram shown in fig. 1, the embodiment of the present invention further provides another page restore method, where the flow of the method may be interactively executed by a terminal, a service side server, and an analysis server, as shown in fig. 5:
Step S501, the terminal sends a page request to the service side server.
Step S502, the service side server returns page data carrying the link hijacking detection js and the page URL whitelist to the terminal.
Step S503, the terminal renders and displays the page according to the page data.
Step S504, the terminal runs the link hijacking detection js to scan the displayed page and acquires the URLs present in the page.
Step S505, the terminal compares the URLs present in the page with the page URL whitelist.
Step S506, when the terminal determines that a URL present in the page does not match the page URL whitelist, the page is determined to be a link hijacking page.
Step S507, the terminal sends the page data of the link hijacking page to the analysis server.
Step S508, the analysis server determines the category of the link hijacking page according to the attribute information of the link hijacking page and the hijacking information contained in the link hijacking page.
Step S509, the analysis server filters out the page data of repeated link hijacking pages in the category to which the link hijacking page belongs.
Step S510, the analysis server restores the link hijacking page according to the page data of the link hijacking page.
Step S511, the analysis server determines the position of the link hijacking in the link hijacking page according to the restored link hijacking page.
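The whitelist comparison in steps S504 to S506 runs on the terminal as a js script; the Python sketch below only illustrates the comparison logic. It assumes the whitelist is a set of allowed host names and that relative URLs belong to the page's own site, neither of which is specified in the description. The names `UrlCollector`, `find_unlisted_urls`, and `is_link_hijacking_page` are hypothetical.

```python
from html.parser import HTMLParser
from typing import List, Set
from urllib.parse import urlparse


class UrlCollector(HTMLParser):
    """Collect the values of src and href attributes found in the page."""

    def __init__(self) -> None:
        super().__init__()
        self.urls: List[str] = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("src", "href") and value:
                self.urls.append(value)


def find_unlisted_urls(page_html: str, whitelist_hosts: Set[str]) -> List[str]:
    """Return URLs whose host is not in the whitelist (relative URLs are skipped)."""
    collector = UrlCollector()
    collector.feed(page_html)
    unlisted = []
    for url in collector.urls:
        host = urlparse(url).hostname
        if host is not None and host not in whitelist_hosts:
            unlisted.append(url)
    return unlisted


def is_link_hijacking_page(page_html: str, whitelist_hosts: Set[str]) -> bool:
    """A page with any URL outside the whitelist is treated as link-hijacked."""
    return bool(find_unlisted_urls(page_html, whitelist_hosts))
```

In this sketch the list returned by `find_unlisted_urls` would correspond to the hijacking information that the terminal reports alongside the page data.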
In the embodiment of the invention, when the terminal detects that the page has been link-hijacked, the page data of the link hijacking page is acquired, the link hijacking page is restored according to that page data, and the position of the link hijacking in the page is then determined according to the restored link hijacking page. Because the terminal detects whether the page has been link-hijacked and reports the page data of the link hijacking page when hijacking occurs, link hijacking can be discovered in time without manual judgment, which improves the efficiency of detecting link hijacking. In addition, because the page data of the link hijacking page is obtained from the terminal and the link hijacking scene is restored from that page data, there is no need to take screenshots or collect evidence of the link hijacking scene manually, which improves the efficiency of evidence collection. Furthermore, the position of the link hijacking in the page can be determined according to the restored link hijacking page, so that the link hijacking in the page can be analyzed intuitively and its risk assessed.
Based on the same technical concept, the embodiment of the present invention provides a page restore device, as shown in fig. 6, which is implemented as all or part of the analysis server 130 in fig. 1 through hardware or a combination of hardware and software. The apparatus 600 includes: a receiving module 601, a restoring module 602 and a processing module 603.
The receiving module 601 is configured to receive page data of a link hijacking page sent by a terminal, where the page data of the link hijacking page is sent by the terminal when determining that the link hijacking occurs on the page;
the restoring module 602 is configured to restore the link hijacking page according to the page data of the link hijacking page;
And the processing module 603 is configured to determine a location in the link hijacking page where the link hijacking occurs according to the restored link hijacking page.
In one possible design, the processing module 603 is further configured to filter the page data of the link hijacking page repeated in the category to which the link hijacking page belongs before restoring the link hijacking page according to the page data of the link hijacking page.
In one possible design, the category of the link hijacking page may be determined according to the attribute information of the link hijacking page, or may be determined according to the attribute information of the link hijacking page and hijacking information contained in the link hijacking page.
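The per-category filtering of repeated page data could look like the sketch below. It assumes the category is derived from a single attribute string (for example the page URL) and that a report is a duplicate when the same hijacking information has already been seen in that category; the class name `DuplicateFilter` and both parameters are hypothetical, and the description does not specify the filter's internals.

```python
from collections import defaultdict
from typing import DefaultDict, Set


class DuplicateFilter:
    """Keep one set of already-seen hijacking information per category."""

    def __init__(self) -> None:
        self._seen_by_category: DefaultDict[str, Set[str]] = defaultdict(set)

    def accept(self, page_attribute: str, hijack_info: str) -> bool:
        """Return True for the first report of this hijacking in its category."""
        seen = self._seen_by_category[page_attribute]
        if hijack_info in seen:
            return False  # repeated page data: filtered out before restoration
        seen.add(hijack_info)
        return True
```

Only reports for which `accept` returns True would go on to the restoration step, which keeps the analysis server from rendering the same hijacked page repeatedly.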
In one possible design, the processing module 603 is specifically configured to adjust escape characters in the page data of the link hijacking page and modify a storage path corresponding to the page data of the link hijacking page, and then to render the link hijacking page according to the page data of the link hijacking page after the escape characters have been adjusted and the storage path has been modified.
The embodiment of the invention also provides a terminal device, which comprises at least one processing unit and at least one storage unit, wherein the storage unit stores a computer program, and when the program is executed by the processing unit, the processing unit is caused to execute the steps of the page restore method.
Fig. 7 is a schematic diagram of a hardware structure of a terminal device according to an embodiment of the present invention, where the terminal device may be a desktop computer, a portable computer, a smart phone, a tablet computer, or the like. In particular, the terminal device may comprise a memory 701, a processor 702 and a computer program stored on the memory, the processor 702 implementing the steps of any of the page restore methods of the above embodiments when executing the program. The memory 701 may include a Read Only Memory (ROM) and a Random Access Memory (RAM), among other things, and provides program instructions and data stored in the memory 701 to the processor 702.
Further, the terminal device in the embodiment of the present application may further include an input device 703, an output device 704, and the like. The input device 703 may include a keyboard, a mouse, a touch screen, and so on; the output device 704 may include a display apparatus such as a liquid crystal display (LCD), a cathode ray tube (CRT), or a touch screen. The memory 701, the processor 702, the input device 703, and the output device 704 may be connected by a bus or in another manner; a bus connection is taken as the example in Fig. 7. The processor 702 calls the program instructions stored in the memory 701 and executes the page restore method provided in the above embodiments according to the obtained program instructions.
The embodiment of the invention also provides a computer readable storage medium storing a computer program executable by a terminal device, which when run on the terminal device causes the terminal device to perform the steps of the page restore method.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, or as a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (7)

1. A method for page restore, comprising:
the terminal sends a web page request to a service side server;
The terminal receives the web page information carrying the link hijacking detection js and the page URL white list returned by the service side server;
the terminal renders and displays a web page according to the web page information, and runs the link hijacking detection js to scan the web page to obtain a URL (uniform resource locator) existing in the web page;
The terminal compares the URL existing in the web page with the page URL white list;
when a URL existing in the web page does not match the page URL white list, the terminal determines the web page to be a link hijacking page, wherein link hijacking refers to the insertion of a URL into the web page;
the analysis server configures virtual filters corresponding to each category of the link hijacking page in the management configuration file;
the analysis server receives the page data of the link hijacking page sent by the terminal, wherein the page data of the link hijacking page is sent by the terminal when the link hijacking of the page is determined;
the analysis server filters the page data of the repeated link hijacking page in the category to which the link hijacking page belongs through a virtual filter corresponding to the category to which the link hijacking page belongs;
The analysis server adjusts escape characters in the page data of the link hijacking page and modifies a storage path corresponding to the page data of the link hijacking page;
The analysis server renders the link hijacking page according to the page data of the link hijacking page after the escape character is adjusted and the storage path is modified;
the analysis server determines the position of the link hijacking in the link hijacking page according to the restored link hijacking page;
the analysis server analyzes the link hijacking in the restored link hijacking page and performs a risk assessment on it to obtain an evaluation result;
and the analysis server sends the evaluation result and the screenshot of the restored link hijacking page to staff of the service party.
2. The method of claim 1, wherein the repeated link hijacking page is a link hijacking page in which attribute information of the link hijacking page and contained hijacking information are repeated.
3. The method of claim 2, wherein the category to which the link hijacking page belongs is: determined according to the attribute information of the link hijacking page;
or determined according to hijacking information contained in the link hijacking page;
or determined according to the attribute information of the link hijacking page and hijacking information contained in the link hijacking page.
4. A page restore device, comprising:
The receiving module is used for receiving the page data of the link hijacking page sent by the terminal, wherein the page data of the link hijacking page is sent by the terminal when the link hijacking of the page is determined; the terminal is used for: sending a web page request to a service side server; receiving web page information carrying a link hijacking detection js and a page URL white list returned by the service side server; rendering and displaying a web page according to the web page information, and running the link hijacking detection js to scan the web page to obtain a URL (uniform resource locator) existing in the web page; comparing URLs present in the web page with the page URL whitelist; when a URL existing in the web page does not match the page URL white list, determining the web page to be a link hijacking page, wherein link hijacking refers to the insertion of a URL into the web page;
The restoring module is used for configuring virtual filters corresponding to each category of the link hijacking page in the management configuration file; filtering the page data of repeated link hijacking pages in the category to which the link hijacking page belongs through a virtual filter corresponding to the category to which the link hijacking page belongs; adjusting escape characters in the page data of the link hijacking page, and modifying a storage path corresponding to the page data of the link hijacking page;
the analysis server renders the link hijacking page according to the page data of the link hijacking page after the escape character is adjusted and the storage path is modified;
The processing module is used for determining the position of the link hijacking in the link hijacking page according to the restored link hijacking page; analyzing and evaluating risk of the restored link hijacking in the link hijacking page to obtain an evaluation result; and sending the evaluation result and the restored screenshot of the link hijacking page to a staff of the service party.
5. The apparatus of claim 4, wherein the category to which the link hijacking page belongs is: determined according to the attribute information of the link hijacking page;
or determined according to the attribute information of the link hijacking page and hijacking information contained in the link hijacking page.
6. A terminal device comprising at least one processing unit and at least one storage unit, wherein the storage unit stores a computer program which, when executed by the processing unit, causes the processing unit to perform the steps of the method of any of claims 1-3.
7. A computer readable storage medium, characterized in that it stores a computer program executable by a terminal device, which program, when run on the terminal device, causes the terminal device to perform the steps of the method according to any of claims 1-3.
CN201810234123.XA 2018-03-21 2018-03-21 Page restoration method and device Active CN110334301B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810234123.XA CN110334301B (en) 2018-03-21 2018-03-21 Page restoration method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810234123.XA CN110334301B (en) 2018-03-21 2018-03-21 Page restoration method and device

Publications (2)

Publication Number Publication Date
CN110334301A CN110334301A (en) 2019-10-15
CN110334301B true CN110334301B (en) 2024-05-03

Family

ID=68138829

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810234123.XA Active CN110334301B (en) 2018-03-21 2018-03-21 Page restoration method and device

Country Status (1)

Country Link
CN (1) CN110334301B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112507271B (en) * 2020-12-14 2023-03-24 杭州趣链科技有限公司 Webpage evidence obtaining method, device and equipment
CN112631869B (en) * 2020-12-28 2023-01-17 深圳市彬讯科技有限公司 Page loading data monitoring method and device, computer equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103051722A (en) * 2012-12-26 2013-04-17 新浪网技术(中国)有限公司 Method and related equipment for determining whether page is hijacked or not
WO2017054731A1 (en) * 2015-09-30 2017-04-06 北京奇虎科技有限公司 Method and device for processing hijacked browser
WO2017054716A1 (en) * 2015-09-30 2017-04-06 北京奇虎科技有限公司 Method for recognizing hijacked browser and browser
CN107124430A (en) * 2017-06-08 2017-09-01 腾讯科技(深圳)有限公司 Pagejack monitoring method, device, system and storage medium

Also Published As

Publication number Publication date
CN110334301A (en) 2019-10-15

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant