CN110046310B - Method and device for analyzing jump link in page - Google Patents


Info

Publication number
CN110046310B
Authority
CN
China
Prior art keywords
webpage
link
jump
page
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910263875.3A
Other languages
Chinese (zh)
Other versions
CN110046310A (en)
Inventor
钱宝坤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing ByteDance Network Technology Co Ltd
Original Assignee
Beijing ByteDance Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing ByteDance Network Technology Co Ltd
Priority to CN201910263875.3A
Publication of CN110046310A
Application granted
Publication of CN110046310B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/955 Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
    • G06F16/9558 Details of hyperlinks; Management of linked annotations
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking

Abstract

The disclosure provides a method, an apparatus, an electronic device and a computer-readable storage medium for analyzing jump links in a page. The method comprises: determining webpage information of a target webpage corresponding to a jump link to be embedded into a current webpage; determining whether the webpage information meets a preset regulation; and, in response to the webpage information meeting the preset regulation, embedding the jump link into the current webpage. Because a jump link is embedded only after the webpage information of its target webpage has been checked against the preset regulation, the method addresses a technical problem in the prior art: the webpage information of the target webpage of an embedded jump link may violate relevant laws and regulations, which adversely affects the operator of the current webpage and may even bring legal risks.

Description

Method and device for analyzing jump link in page
Technical Field
The present disclosure relates to the field of information processing technologies, and in particular, to a method and an apparatus for analyzing a jump link in a page, an electronic device, and a computer-readable storage medium.
Background
With the development of mobile communication technology and the popularization of mobile terminals such as mobile phone terminals and tablet computers, it is more convenient for mobile users to browse web pages. In this context, various Applications (APPs) applied to the mobile terminal are on the rise, such as a browser, a shopping APP, a microblog APP, and the like.
To let users reach pages within an application quickly, a service provider often embeds several jump links (generally the URLs of target pages) in a relevant page, so that the mobile terminal can jump directly to the corresponding target page according to the jump link. For example, a commodity link of a shopping website may be embedded in a webpage displayed by a browser, and the mobile terminal may jump directly from the displayed page to the commodity page of that shopping website according to the commodity link.
However, the operator of the current webpage cannot control the webpage information behind these jump links. Its content may violate relevant laws and regulations, which adversely affects the operator of the current webpage and may even bring legal risks.
Disclosure of Invention
In a first aspect, an embodiment of the present disclosure provides a method for analyzing a jump link in a page, including: determining webpage information of a target webpage corresponding to a jump link to be embedded into a current webpage; determining whether the webpage information meets a preset regulation; and in response to the webpage information meeting the preset regulation, embedding the jump link into the current webpage.
Further, the determining of the web page information of the target web page corresponding to the jump link to be embedded into the current web page includes:
loading the target webpage into a browser where the current webpage is located;
simulating click behavior for web page links within the target web page;
acquiring a webpage corresponding to the webpage link according to the clicking behavior, and acquiring a link jump address from the webpage corresponding to the webpage link;
and acquiring the webpage information according to the link jump address.
Further, the acquiring the webpage information according to the link jump address includes:
determining the calling times of each link jump address; and taking the sum of the calling times of the link jump addresses as the webpage information.
Further, the acquiring the webpage information according to the link jump address includes:
forming a jump target page list according to the link jump address;
and acquiring webpage contents corresponding to all the jump target pages according to the jump target page list, and taking the webpage contents as the webpage information.
Further, the determining whether the webpage information meets preset regulations includes:
classifying the link jump addresses according to a triggering mode, and analyzing webpage contents contained in each link jump address to obtain webpage data corresponding to each link jump address;
determining the priority of the corresponding link jump address according to the webpage data;
and sequentially determining whether the webpage information corresponding to each link jump address meets preset regulations or not according to the priority.
Further, the simulating click behavior for the web page link in the target web page includes:
enabling the browser to inject a page script for triggering an interaction event so as to simulate a click behavior for a webpage link in the target webpage; or,
simulating click behavior for the webpage links in the target webpage through a synthetic interaction simulation interface provided by the browser; or,
sending a system interaction event to a window of the browser to simulate click behavior for a webpage link in the target webpage.
Further, before the simulating click behavior for the web page link in the target web page, the method further comprises:
disabling at least one of a page jump behavior and a popup interactive behavior in the browser.
Further, the preset regulation is at least one of: being legal, being compliant, not infringing the interests of a website operator, and the total number of calls of the link jump addresses reaching a preset number of calls.
In a second aspect, an embodiment of the present disclosure provides an apparatus for analyzing jump links in a page, including: a webpage information determining module, configured to determine webpage information of a target webpage corresponding to a jump link to be embedded into a current webpage; a judging module, configured to determine whether the webpage information meets a preset regulation; and an embedding module, configured to embed the jump link into the current webpage in response to the webpage information meeting the preset regulation.
Further, the webpage information determining module includes:
the loading unit is used for loading the target webpage into a browser where the current webpage is located;
the click simulation unit is used for simulating click behaviors aiming at the webpage links in the target webpage;
the address acquisition unit is used for acquiring a webpage corresponding to the webpage link according to the clicking behavior and acquiring a link jump address from the webpage corresponding to the webpage link;
and the information acquisition unit is used for acquiring the webpage information according to the link jump address.
Further, the information acquiring unit is specifically configured to: determining the calling times of each link jump address; and taking the sum of the calling times of the link jump addresses as the webpage information.
Further, the information acquiring unit is specifically configured to: forming a jump target page list according to the link jump address; and acquiring webpage contents corresponding to all the jump target pages according to the jump target page list, and taking the webpage contents as the webpage information.
Further, the determining module is specifically configured to: classifying the link jump addresses according to a triggering mode, and analyzing webpage contents contained in the link jump addresses to obtain webpage data corresponding to the link jump addresses; determining the priority of the corresponding link jump address according to the webpage data; and sequentially determining whether the webpage information corresponding to each link jump address meets preset regulations or not according to the priority.
Further, the simulated click unit is specifically configured to: enabling the browser to inject a page script for triggering an interaction event so as to simulate a click behavior aiming at a webpage link in the target webpage; or, simulating click behavior aiming at the webpage link in the target webpage through a synthetic interactive simulation interface provided by the browser; or sending a system interaction event to a window of the browser to simulate clicking behavior of a webpage link in the target webpage.
Further, the apparatus further comprises:
a disabling module, configured to disable the page jump behavior and the popup interactive behavior in the browser before the click behavior for the webpage link in the target webpage is simulated.
Further, the preset regulation is at least one of: being legal, being compliant, not infringing the interests of a website operator, and the total number of calls of the link jump addresses reaching a preset number of calls.
In a third aspect, an embodiment of the present disclosure provides an electronic device, including: at least one processor; and
a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of analyzing jump links in pages according to any of the first aspects.
In a fourth aspect, the present disclosure provides a non-transitory computer-readable storage medium, which stores computer instructions for causing a computer to execute the method for analyzing jump links in a page according to any one of the first aspect.
The disclosure provides a method, an apparatus, an electronic device and a computer-readable storage medium for analyzing jump links in a page. The method comprises: determining webpage information of a target webpage corresponding to a jump link to be embedded into a current webpage; determining whether the webpage information meets a preset regulation; and, in response to the webpage information meeting the preset regulation, embedding the jump link into the current webpage. Because a jump link is embedded only after the webpage information of its target webpage has been checked against the preset regulation, the method addresses a technical problem in the prior art: the webpage information of the target webpage of an embedded jump link may violate relevant laws and regulations, which adversely affects the operator of the current webpage and may even bring legal risks.
The foregoing is only a summary of the technical solution of the present disclosure, given to promote a clear understanding of its technical means; the present disclosure may be embodied in other specific forms without departing from its spirit or essential attributes.
Drawings
In order to more clearly illustrate the embodiments of the present disclosure or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present disclosure, and other drawings can be obtained according to the drawings without creative efforts for those skilled in the art.
Fig. 1 is a flowchart of a first embodiment of a method for analyzing a jump link in a page according to an embodiment of the present disclosure;
Fig. 2 is a flowchart of a second embodiment of a method for analyzing jump links in a page according to the present disclosure;
Fig. 3 is a schematic structural diagram of a first embodiment of an apparatus for analyzing jump links in a page according to an embodiment of the present disclosure;
Fig. 4 is a schematic structural diagram of a second embodiment of an apparatus for analyzing jump links in a page according to an embodiment of the present disclosure;
Fig. 5 is a schematic structural diagram of an electronic device provided according to an embodiment of the present disclosure.
Detailed Description
The embodiments of the present disclosure are described below with specific examples, and other advantages and effects of the present disclosure will be readily apparent to those skilled in the art from the disclosure in the specification. It is to be understood that the described embodiments are merely illustrative of some, and not restrictive, of the embodiments of the disclosure. The disclosure may be embodied or carried out in various other specific embodiments, and various modifications and changes may be made in the details within the description without departing from the spirit of the disclosure. It is to be noted that the features in the following embodiments and examples may be combined with each other without conflict. All other embodiments, which can be derived by a person skilled in the art from the embodiments disclosed herein without making any creative effort, shall fall within the protection scope of the present disclosure.
It is noted that various aspects of the embodiments are described below within the scope of the appended claims. It should be apparent that the aspects described herein may be embodied in a wide variety of forms and that any specific structure and/or function described herein is merely illustrative. Based on the disclosure, one skilled in the art should appreciate that one aspect described herein may be implemented independently of any other aspects and that two or more of these aspects may be combined in various ways. For example, an apparatus may be implemented and/or a method practiced using any number of the aspects set forth herein. Additionally, such an apparatus may be implemented and/or such a method may be practiced using other structure and/or functionality in addition to one or more of the aspects set forth herein.
It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present disclosure, and the drawings only show the components related to the present disclosure rather than the number, shape and size of the components in actual implementation, and the type, amount and ratio of the components in actual implementation may be changed arbitrarily, and the layout of the components may be more complicated.
In addition, in the following description, specific details are provided to facilitate a thorough understanding of the examples. However, it will be understood by those skilled in the art that the aspects may be practiced without these specific details.
Fig. 1 is a flowchart of a first embodiment of a method for analyzing a jump link in a page according to an embodiment of the present disclosure, where the method for analyzing a jump link in a page according to this embodiment may be performed by an apparatus for analyzing a jump link in a page, where the apparatus for analyzing a jump link in a page may be implemented as software or as a combination of software and hardware, and the apparatus for analyzing a jump link in a page may be integrated in a certain device in a system for analyzing a jump link in a page, such as a server for analyzing a jump link in a page or a terminal device for analyzing a jump link in a page. As shown in fig. 1, the method comprises the steps of:
Step S101: determining webpage information of a target webpage corresponding to a jump link to be embedded into a current webpage;
In this embodiment, the current webpage may be the home page of a website, for example the 360 navigation page, or may be a jump page reached from the home page of a website or from another jump page, for example a secondary jump page.
The jump link may be, but is not limited to, any of the following: text links, picture links, video links, or the like.
The web page information includes, but is not limited to, any one or more of the following: the webpage address corresponding to the target webpage, the address of the jump link in the target webpage, the total calling number of the address of the jump link in the target webpage, the classification of the address of the jump link in the target webpage, the webpage data of the address of the jump link in the target webpage, and the webpage content corresponding to the jump link in the target webpage.
Step S102: determining whether the webpage information meets preset regulations;
The preset regulation includes, but is not limited to, at least one of: being legal, being compliant, not infringing the interests of the website operator, and the total number of calls of the link jump addresses reaching the preset number of calls.
Step S103: in response to the webpage information meeting the preset regulation, embedding the jump link into the current webpage.
In this embodiment, if the webpage information of the target webpage is determined to meet the preset regulation, for example it is legal, compliant (for example, it conforms to the various specifications of the website operator) and does not infringe the interests of the website operator, the jump link is embedded into the current webpage so that it is displayed to the user. If the webpage information of the target webpage does not meet the preset regulation, the jump link is not embedded into the current webpage, or is deleted from the current webpage; that is, the jump link is not displayed to the user.
In this embodiment, whether the webpage information of the target webpage corresponding to the jump link to be embedded meets the preset regulation is determined first, and only a jump link whose webpage information meets the preset regulation is embedded into the current webpage. This addresses the technical problem in the prior art that the webpage information of the target webpage of an embedded jump link may violate relevant laws and regulations, which adversely affects the operator of the current webpage and may even bring legal risks.
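Steps S101 to S103 can be sketched as a small gate function. This is a minimal illustration only, not the patented implementation: the `PageInfo` container, the function name `analyze_and_embed`, and the two callables passed in are all hypothetical, and the current webpage is modelled simply as a list of embedded links.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PageInfo:
    """Hypothetical container for the webpage information of a target webpage."""
    url: str
    jump_addresses: List[str] = field(default_factory=list)
    total_calls: int = 0
    content: str = ""


def analyze_and_embed(
    current_page_links: List[str],
    jump_link: str,
    determine_info: Callable[[str], PageInfo],
    meets_regulation: Callable[[PageInfo], bool],
) -> bool:
    """Embed jump_link only if its target webpage meets the preset regulation."""
    info = determine_info(jump_link)          # step S101
    if not meets_regulation(info):            # step S102
        # Not embedded; delete it if it was already on the page.
        if jump_link in current_page_links:
            current_page_links.remove(jump_link)
        return False
    if jump_link not in current_page_links:   # step S103
        current_page_links.append(jump_link)
    return True
```

How `determine_info` and `meets_regulation` work is left open here on purpose; the second embodiment describes one way to obtain the webpage information.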
Fig. 2 is a flowchart of a second embodiment of a method for analyzing a jump link in a page according to the present disclosure. This embodiment further defines step S101 on the basis of the foregoing embodiment. The method of this embodiment may be executed by an apparatus for analyzing a jump link in a page, which may be implemented as software or as a combination of software and hardware, and which may be integrated in a device in a system for analyzing a jump link in a page, such as a server or a terminal device for analyzing a jump link in a page. As shown in Fig. 2, the method comprises the following steps:
Step S201: loading the target webpage into a browser where the current webpage is located;
Step S202: simulating click behavior for a webpage link in the target webpage;
In an alternative embodiment, step S202 includes: enabling the browser to inject a page script for triggering an interaction event so as to simulate a click behavior for a webpage link in the target webpage; or simulating click behavior for the webpage link in the target webpage through a synthetic interaction simulation interface provided by the browser; or sending a system interaction event to a window of the browser to simulate click behavior for a webpage link in the target webpage.
In an alternative embodiment, before performing step S202, the method further comprises:
disabling at least one of a page jump behavior and a popup interactive behavior in the browser.
Behaviors capable of causing a page jump include any of the following: JS, meta, form submit, a href, open, reload, replaceState and pushState. When such a behavior is triggered, the jump address is intercepted and stored in the browser.
The popup interactive behavior includes any of alert, confirm, prompt, print and file input; disabling it prevents these dialogs from interfering with the simulated click behavior.
In addition, several JS methods may be added to the current page: one to obtain the array of target Uniform Resource Locators (URLs) that the browser intercepted when jumps were triggered, and one to clear the accumulated jump captures so that the URL array obtained after the next jump contains only new entries.
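The intercept, read and clear cycle described above can be modelled outside the browser. The sketch below is a language-neutral illustration written in Python rather than the page-side JS the text refers to; the class `JumpCapture` and its method names are hypothetical.

```python
from typing import List


class JumpCapture:
    """Models a buffer of intercepted jump URLs (hypothetical helper)."""

    def __init__(self) -> None:
        self._urls: List[str] = []
        # While True, jump attempts are recorded instead of followed.
        self.navigation_disabled = True

    def intercept(self, url: str) -> bool:
        """Record a jump attempt; return True only if the jump may proceed."""
        if self.navigation_disabled:
            self._urls.append(url)
            return False
        return True

    def get_urls(self) -> List[str]:
        """Return a copy of the URLs captured since the last clear."""
        return list(self._urls)

    def clear(self) -> None:
        """Discard accumulated captures before the next simulated click."""
        self._urls.clear()
```

Calling `clear()` between simulated clicks keeps each click's captured URLs separate, which is the purpose the text gives for the second JS method.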
Step S203: acquiring a webpage corresponding to the webpage link according to the clicking behavior, and acquiring a link jump address from the webpage corresponding to the webpage link;
The link jump address is the URL obtained from the webpage link.
Step S204: acquiring the webpage information according to the link jump address;
Step S205: determining whether the webpage information meets a preset regulation;
Step S206: in response to the webpage information meeting the preset regulation, embedding the jump link into the current webpage.
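One way to turn raw link values into link jump addresses is to resolve each against the URL of the page it appears on. The sketch below uses Python's standard `urllib.parse`. The function name `resolve_jump_addresses` is hypothetical, and the filtering it applies (http/https only, no same-page anchors, no repeats) is an assumption not stated in the text.

```python
from typing import List
from urllib.parse import urldefrag, urljoin, urlparse


def resolve_jump_addresses(page_url: str, hrefs: List[str]) -> List[str]:
    """Resolve raw href values to absolute http(s) URLs, dropping duplicates."""
    resolved: List[str] = []
    for href in hrefs:
        # Make the link absolute, then strip any #fragment part.
        absolute, _fragment = urldefrag(urljoin(page_url, href))
        if urlparse(absolute).scheme not in ("http", "https"):
            continue  # assumption: skip javascript:, mailto: and similar
        if absolute == page_url or absolute in resolved:
            continue  # assumption: same-page anchors and repeats are not jumps
        resolved.append(absolute)
    return resolved
```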
In this embodiment, the target webpage is loaded into the browser where the current webpage is located; click behavior for the webpage links in the target webpage is simulated; the webpage corresponding to each webpage link is obtained according to the click behavior, and the link jump address is obtained from that webpage; the webpage information is then obtained according to the link jump address; and only a jump link whose webpage information meets the preset regulation is embedded into the current webpage. This addresses the technical problem in the prior art that the webpage information of the target webpage of an embedded jump link may violate relevant laws and regulations, which adversely affects the operator of the current webpage and may even bring legal risks.
In an alternative embodiment, step S203 comprises:
determining the calling times of each link jump address; and taking the sum of the calling times of the link jump addresses as the webpage information.
A webpage link may include a plurality of link jump addresses, and the number of calls of each link jump address is counted. The larger the number of calls, the more likely the link jump address is to be clicked. When the total number of calls of all link jump addresses exceeds the preset number of calls, the corresponding jump link is considered likely to be clicked; that is, the jump link is determined to meet the preset regulation and is embedded into the current webpage.
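The call-count criterion can be sketched as follows, assuming the captured link jump addresses are available as a plain sequence of URL strings, one entry per call. The function names are hypothetical. The comparison uses "exceeds" as in the paragraph above; elsewhere the text says the total "reaches" the preset number, so `>=` may be the intended test.

```python
from collections import Counter
from typing import Dict, Iterable


def per_address_calls(captured: Iterable[str]) -> Dict[str, int]:
    """Count how many times each link jump address was called."""
    return dict(Counter(captured))


def total_calls(captured: Iterable[str]) -> int:
    """Sum the per-address call counts; this sum is the webpage information."""
    return sum(per_address_calls(captured).values())


def meets_call_regulation(captured: Iterable[str], preset_calls: int) -> bool:
    """True when the total number of calls exceeds the preset number of calls."""
    return total_calls(captured) > preset_calls
```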
In an alternative embodiment, step S203 comprises:
Step S2031: forming a jump target page list according to the link jump address;
Step S2032: acquiring webpage contents corresponding to all the jump target pages according to the jump target page list, and taking the webpage contents as the webpage information.
In this embodiment, the web page content includes, but is not limited to, any one or more of the following: HTML (the body of the web page), JavaScript files (which determine the behavior of the web page, e.g., responses to events such as clicks), style files (which determine attributes of elements, e.g., appearance and size), pictures, frames, iframes, etc. It is then determined whether the web page content meets the preset regulation, for example whether it is legal, compliant, or consistent with the regulations of the website operator; if so, the jump link is embedded into the current webpage.
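Steps S2031 and S2032 can be sketched with the page fetcher passed in as a parameter, so that the example runs without network access. All names are hypothetical. Removing repeated addresses from the list and checking content against a list of banned terms are assumptions made for illustration; the text does not say how compliance of content is decided.

```python
from typing import Callable, Dict, Iterable, List


def build_target_list(jump_addresses: Iterable[str]) -> List[str]:
    """Form the jump target page list, keeping first-seen order without repeats."""
    return list(dict.fromkeys(jump_addresses))


def collect_contents(targets: List[str], fetch: Callable[[str], str]) -> Dict[str, str]:
    """Fetch the webpage content of every jump target page in the list."""
    return {url: fetch(url) for url in targets}


def contents_meet_regulation(contents: Dict[str, str], banned_terms: List[str]) -> bool:
    """Assumed check: no page may contain any banned term (case-insensitive)."""
    lowered = [term.lower() for term in banned_terms]
    return not any(term in body.lower() for body in contents.values() for term in lowered)
```

In practice `fetch` would wrap whatever HTTP client or browser call retrieves the page body.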
In an alternative embodiment, step S204 further comprises:
Step S2041: classifying the link jump addresses according to a triggering mode, and analyzing webpage contents contained in the link jump addresses to obtain webpage data corresponding to the link jump addresses;
The link jump addresses can be classified by triggering mode: for example, link jump addresses triggered in full-screen mode form one class and link jump addresses triggered in interactive mode form another. The classified link jump addresses are then analyzed and counted to obtain the webpage data.
The webpage data includes, for example, the link jump addresses that have the same webpage information. For convenience of subsequent processing, the webpage data can be quantized to obtain corresponding webpage data values. The webpage data may also include the number of calls of each link jump address, and the like.
Step S2042: determining the priority of the corresponding link jump address according to the webpage data;
In this embodiment, the webpage data covers the different link jump addresses in the target page. If the webpage data value of a link jump address is large, the corresponding page is probably a secondary jump page that users are more likely to want to access, so that link jump address is processed first. The priority of each link jump address can therefore be determined from its webpage data value: the higher the value, the higher the priority.
Step S2043: sequentially determining, according to the priority, whether the webpage information corresponding to each link jump address meets the preset regulation.
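Steps S2041 to S2043 can be sketched as a group, sort and check pipeline. The `JumpAddress` record, the trigger labels and the function names are hypothetical, and `data_value` stands in for the quantized webpage data value, whose computation the text does not specify.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class JumpAddress:
    url: str
    trigger: str       # assumed labels, e.g. "fullscreen" or "interactive"
    data_value: float  # quantized webpage data value (how it is computed is not specified)


def classify_by_trigger(addresses: List[JumpAddress]) -> Dict[str, List[JumpAddress]]:
    """Step S2041: group link jump addresses by triggering mode."""
    groups: Dict[str, List[JumpAddress]] = {}
    for address in addresses:
        groups.setdefault(address.trigger, []).append(address)
    return groups


def check_in_priority_order(
    addresses: List[JumpAddress],
    meets_regulation: Callable[[JumpAddress], bool],
) -> List[Tuple[str, bool]]:
    """Steps S2042 and S2043: higher data value first, then check each in turn."""
    ordered = sorted(addresses, key=lambda a: a.data_value, reverse=True)
    return [(a.url, meets_regulation(a)) for a in ordered]
```

Because `sorted` is stable, addresses with equal data values keep their original relative order.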
Fig. 3 is a schematic structural diagram of a first embodiment of an apparatus for analyzing a jump link in a page according to an embodiment of the present disclosure. The apparatus may be implemented as software or as a combination of software and hardware, and may be integrated in a device in a system for analyzing a jump link in a page, such as a server or a terminal device for analyzing a jump link in a page. As shown in Fig. 3, the apparatus 300 includes: a webpage information determining module 301, a judging module 302, and an embedding module 303, wherein:
the webpage information determining module 301 is configured to determine webpage information of a target webpage corresponding to a jump link to be embedded into a current webpage;
the judging module 302 is configured to determine whether the webpage information meets a preset regulation;
the embedding module 303 is configured to embed the jump link into the current webpage in response to the webpage information meeting the preset regulation.
The apparatus shown in fig. 3 can perform the method of the embodiment shown in fig. 1, and reference may be made to the related description of the embodiment shown in fig. 1 for a part of this embodiment that is not described in detail. The implementation process and technical effect of the technical solution refer to the description in the embodiment shown in fig. 1, and are not described herein again.
Fig. 4 is a schematic structural diagram of a second embodiment of an apparatus for analyzing a jump link in a page according to an embodiment of the present disclosure, where the apparatus for analyzing a jump link in a page may be implemented as software or as a combination of software and hardware, and the apparatus for analyzing a jump link in a page may be integrated in a certain device in a system for analyzing a jump link in a page, such as a server for analyzing a jump link in a page or a terminal device for analyzing a jump link in a page. As shown in fig. 4, the apparatus 400 includes various modules in the apparatus 300 for analyzing jump links in a page according to the first embodiment, and further defines a web page information determining module 301. Wherein, the web page information determining module 301 comprises a loading unit 401, a simulated click unit 402, an address obtaining unit 403 and an information obtaining unit 404, wherein,
the loading unit 401 is configured to load the target webpage to a browser where the current webpage is located;
the simulated click unit 402 is used for simulating click behaviors aiming at the webpage links in the target webpage;
the address obtaining unit 403 is configured to obtain a webpage corresponding to the webpage link according to the click behavior, and obtain a link jump address from the webpage corresponding to the webpage link;
the information obtaining unit 404 is configured to obtain the web page information according to the link jump address.
Further, the information obtaining unit 404 is specifically configured to: determining the calling times of each link jump address; and taking the sum of the calling times of the link jump addresses as the webpage information.
Further, the information obtaining unit 404 is specifically configured to: form a jump target page list according to the link jump addresses; and acquire, according to the jump target page list, the webpage content corresponding to every jump target page, and take the webpage content as the webpage information.
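The list-based variant might be sketched as follows. Deduplicating the addresses and passing in the page fetcher as a callable are assumptions made so that the sketch runs offline; the names `build_jump_target_list`, `collect_page_contents` and `fake_pages` are hypothetical.

```python
def build_jump_target_list(link_jump_addresses):
    """Deduplicate addresses while preserving first-seen order (an assumption)."""
    return list(dict.fromkeys(link_jump_addresses))


def collect_page_contents(target_list, fetch):
    """Map each jump target page to its content using the supplied fetch callable."""
    return {url: fetch(url) for url in target_list}


# Hypothetical stand-in for a real page fetch, so the sketch needs no network.
fake_pages = {
    "https://example.com/a": "<p>A</p>",
    "https://example.com/b": "<p>B</p>",
}
targets = build_jump_target_list(
    ["https://example.com/a", "https://example.com/b", "https://example.com/a"]
)
web_page_info = collect_page_contents(targets, fake_pages.get)
```

In a real system the `fetch` argument would be replaced by whatever mechanism loads the jump target pages.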
Further, the determining module 302 is specifically configured to: classify the link jump addresses according to their triggering mode, and analyze the webpage content contained in each link jump address to obtain webpage data corresponding to each link jump address; determine the priority of each link jump address according to its webpage data; and determine, in order of priority, whether the webpage information corresponding to each link jump address meets the preset regulation.
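One possible reading of this classify, prioritize and check sequence is sketched below. The text here does not specify the trigger modes, the webpage data or the priority rule, so the "auto" and "click" modes, the word count and the ordering are illustrative assumptions only, as is stopping at the first failing address.

```python
from dataclasses import dataclass


@dataclass
class JumpAddress:
    url: str
    trigger: str   # hypothetical trigger modes: "auto" (fires on load) or "click"
    content: str   # webpage content reached through this address


def classify_by_trigger(addresses):
    """Group link jump addresses by trigger mode."""
    groups = {}
    for addr in addresses:
        groups.setdefault(addr.trigger, []).append(addr)
    return groups


def page_data(addr):
    """Derive simple webpage data from the content (here: a word count)."""
    return {"words": len(addr.content.split())}


def priority(addr):
    """Hypothetical rule: auto-triggered jumps first, then pages with more text."""
    trigger_rank = 0 if addr.trigger == "auto" else 1
    return (trigger_rank, -page_data(addr)["words"])


def check_in_priority_order(addresses, meets_regulation):
    """Check each address in priority order; stop at the first failure."""
    for addr in sorted(addresses, key=priority):
        if not meets_regulation(addr):
            return False, addr.url
    return True, None


addresses = [
    JumpAddress("https://example.com/a", "click", "plain product page"),
    JumpAddress("https://example.com/b", "auto", "forbidden offer"),
]
groups = classify_by_trigger(addresses)
ok, failed = check_in_priority_order(
    addresses, lambda a: "forbidden" not in a.content
)
```

Checking higher-priority addresses first lets the analysis end early when an address that matters most already fails the check.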
Further, the simulated click unit 402 is further configured to: cause the browser to inject a page script that triggers an interaction event, so as to simulate a click behavior for a webpage link in the target webpage; or simulate a click behavior for a webpage link in the target webpage through a synthetic interaction simulation interface provided by the browser; or send a system interaction event to a window of the browser to simulate a click behavior for a webpage link in the target webpage.
Further, the apparatus 400 further comprises a disabling module 405, wherein:
the disabling module 405 is configured to disable at least one of a page jump behavior and a popup interactive behavior in the browser before simulating a click behavior for a web page link within the target web page.
Further, the preset regulation is at least one of: being legal; being compliant; not infringing the interests of a website operator; and the total number of calls of the link jump addresses reaching a preset number of calls.
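A configurable check along these lines could look like the following sketch. It assumes that every configured criterion must pass and that "reaching" the preset number of calls means greater than or equal to it; the keyword screen is only a hypothetical stand-in for a real legal or compliance review, and all names are illustrative.

```python
def make_regulation(content_checks=(), min_total_calls=None):
    """Build a predicate from the configured criteria; all of them must pass."""
    def meets(info):
        if any(not check(info["content"]) for check in content_checks):
            return False
        if min_total_calls is not None and info["total_calls"] < min_total_calls:
            return False
        return True
    return meets


# Hypothetical keyword screen standing in for a legal or compliance review.
def no_banned_terms(content):
    return "banned" not in content.lower()


meets = make_regulation(content_checks=[no_banned_terms], min_total_calls=3)
result_pass = meets({"content": "Ordinary page", "total_calls": 3})
result_low_calls = meets({"content": "Ordinary page", "total_calls": 2})
result_banned = meets({"content": "Banned goods", "total_calls": 5})
```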
The apparatus shown in fig. 4 can perform the method of the embodiment shown in fig. 2. For parts of this embodiment that are not described in detail, and for the implementation process and technical effect of the technical solution, refer to the description of the embodiment shown in fig. 2; they are not repeated here.
Referring now to FIG. 5, a block diagram of an electronic device 500 suitable for use in implementing embodiments of the present disclosure is shown. The electronic devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), in-vehicle terminals (e.g., car navigation terminals), and the like, and fixed terminals such as digital TVs, desktop computers, and the like. The electronic device shown in fig. 5 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 5, electronic device 500 may include a processing means (e.g., central processing unit, graphics processor, etc.) 501 that may perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM)502 or a program loaded from a storage means 508 into a Random Access Memory (RAM) 503. In the RAM 503, various programs and data necessary for the operation of the electronic apparatus 500 are also stored. The processing device 501, the ROM 502, and the RAM 503 are connected to each other through a bus 504. An input/output (I/O) interface 505 is also connected to bus 504.
Generally, the following devices may be connected to the I/O interface 505: input devices 506 including, for example, a touch screen, touch pad, keyboard, mouse, image sensor, microphone, accelerometer, gyroscope, etc.; output devices 507 including, for example, a Liquid Crystal Display (LCD), speakers, vibrators, and the like; storage devices 508 including, for example, magnetic tape, hard disk, etc.; and a communication device 509. The communication means 509 may allow the electronic device 500 to communicate with other devices wirelessly or by wire to exchange data. While fig. 5 illustrates an electronic device 500 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication means 509, or installed from the storage means 508, or installed from the ROM 502. The computer program performs the above-described functions defined in the methods of the embodiments of the present disclosure when executed by the processing device 501.
It should be noted that the computer readable medium in the present disclosure can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device.
The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: determine webpage information of a target webpage corresponding to a jump link to be embedded into a current webpage; determine whether the webpage information meets a preset regulation; and, in response to the webpage information meeting the preset regulation, embed the jump link into the current webpage.
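The three operations carried by the program can be sketched as a single gate function. This is an illustration under stated assumptions rather than the disclosed implementation: the current webpage is modeled as a plain list of links, and `determine_info` and `meets_regulation` are hypothetical callables supplied by the caller.

```python
def maybe_embed_jump_link(current_page_links, jump_link, determine_info, meets_regulation):
    """Embed jump_link into the current page only if its target page passes the check."""
    info = determine_info(jump_link)          # step 1: webpage information of the target page
    if meets_regulation(info):                # step 2: compare against the preset regulation
        current_page_links.append(jump_link)  # step 3: embed the jump link
        return True
    return False


page_links = []
embedded = maybe_embed_jump_link(
    page_links,
    "https://example.com/target",
    determine_info=lambda link: {"total_calls": 4},
    meets_regulation=lambda info: info["total_calls"] >= 3,
)
rejected = maybe_embed_jump_link(
    page_links,
    "https://example.com/other",
    determine_info=lambda link: {"total_calls": 1},
    meets_regulation=lambda info: info["total_calls"] >= 3,
)
```

Keeping the information gathering and the regulation check as separate callables mirrors the split between the web page information determining module and the determining module described earlier.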
Computer program code for carrying out operations for aspects of the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented in software or in hardware. In some cases, the name of a unit does not constitute a limitation on the unit itself.
The foregoing description is only exemplary of the preferred embodiments of the disclosure and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the disclosure herein is not limited to the particular combination of features described above, but also encompasses other embodiments formed by any combination of the features described above or their equivalents without departing from the spirit of the disclosure. For example, it encompasses technical solutions formed by interchanging the above features with features of similar function disclosed in (but not limited to) this disclosure.

Claims (14)

1. A method for analyzing jump links in a page, comprising:
determining webpage information of a target webpage corresponding to a jump link to be embedded into a current webpage, wherein the target webpage comprises a webpage link, and a webpage corresponding to the webpage link comprises a link jump address;
classifying the link jump addresses according to a triggering mode, and analyzing webpage content contained in the link jump addresses to obtain webpage data corresponding to the link jump addresses;
determining the priority of the corresponding link jump address according to the webpage data;
determining, in order of priority, whether the webpage information corresponding to each link jump address meets a preset regulation, wherein the preset regulation is at least one of: being legal, being compliant, not infringing the interests of a website operator, and the total number of calls of the link jump addresses reaching a preset number of calls; and
in response to the webpage information meeting the preset regulation, embedding the jump link into the current webpage.
2. The method of analyzing jump links in a page as claimed in claim 1, wherein determining the webpage information of the target webpage corresponding to the jump link to be embedded into the current webpage comprises:
loading the target webpage into a browser where the current webpage is located;
simulating click behavior for web page links within the target web page;
acquiring a webpage corresponding to the webpage link according to the clicking behavior;
acquiring a link jump address from a webpage corresponding to the webpage link;
and acquiring the webpage information according to the link jump address.
3. The method for analyzing jump links in pages according to claim 2, wherein obtaining the web page information according to the link jump address comprises:
determining the calling times of each link jump address;
and taking the sum of the calling times of the link jump addresses as the webpage information.
4. The method for analyzing jump links in pages according to claim 2, wherein said obtaining the web page information according to the link jump address comprises:
forming a jump target page list according to the link jump address;
and acquiring webpage contents corresponding to all the jump target pages according to the jump target page list, and taking the webpage contents as the webpage information.
5. The method of analyzing jump links in a page of any of claims 2-4, wherein said simulating click behavior for web page links within said target web page comprises:
causing the browser to inject a page script that triggers an interaction event, so as to simulate a click behavior for a webpage link in the target webpage; or
simulating a click behavior for a webpage link in the target webpage through a synthetic interaction simulation interface provided by the browser; or
sending a system interaction event to a window of the browser to simulate a click behavior for a webpage link in the target webpage.
6. A method of analyzing jump links in a page as recited in any of claims 2-4, wherein prior to said simulating click behavior for web page links within the target web page, the method further comprises:
disabling at least one of a page jump behavior and a popup interactive behavior in the browser.
7. An apparatus for analyzing jump links in a page, comprising:
a webpage information determining module, configured to determine webpage information of a target webpage corresponding to a jump link to be embedded into a current webpage, wherein the target webpage comprises a webpage link, and a webpage corresponding to the webpage link comprises a link jump address;
a determining module, configured to: classify the link jump addresses according to a triggering mode, and analyze webpage content contained in the link jump addresses to obtain webpage data corresponding to the link jump addresses; determine the priority of the corresponding link jump address according to the webpage data; and determine, in order of priority, whether the webpage information corresponding to each link jump address meets a preset regulation, wherein the preset regulation is at least one of: being legal, being compliant, not infringing the interests of a website operator, and the total number of calls of the link jump addresses reaching a preset number of calls; and
an embedding module, configured to embed the jump link into the current webpage in response to the webpage information meeting the preset regulation.
8. The apparatus for analyzing jump links in a page as recited in claim 7, wherein said web page information determining module comprises:
the loading unit is used for loading the target webpage to a browser where the current webpage is located;
the click simulation unit is used for simulating click behaviors aiming at the webpage links in the target webpage;
the address acquisition unit is used for acquiring a webpage corresponding to the webpage link according to the clicking behavior and acquiring a link jump address from the webpage corresponding to the webpage link;
and the information acquisition unit is used for acquiring the webpage information according to the link jump address.
9. The apparatus for analyzing jump links in a page as recited in claim 8, wherein the information obtaining unit is specifically configured to: determine the number of calls of each link jump address; and take the sum of the numbers of calls of the link jump addresses as the webpage information.
10. The apparatus for analyzing jump links in a page as recited in claim 8, wherein the information obtaining unit is specifically configured to: form a jump target page list according to the link jump addresses; and acquire, according to the jump target page list, the webpage content corresponding to every jump target page, and take the webpage content as the webpage information.
11. The apparatus for analyzing jump links in a page according to any of claims 8-10, wherein the simulated click unit is specifically configured to: cause the browser to inject a page script that triggers an interaction event, so as to simulate a click behavior for a webpage link in the target webpage; or simulate a click behavior for a webpage link in the target webpage through a synthetic interaction simulation interface provided by the browser; or send a system interaction event to a window of the browser to simulate a click behavior for a webpage link in the target webpage.
12. The apparatus for analyzing jump links in a page according to any of claims 8-10, characterized in that said apparatus further comprises:
a disabling module for disabling at least one of a page jump behavior and a popup interactive behavior in the browser prior to simulating a click behavior for a web link within the target web page.
13. An electronic device, comprising:
a memory for storing non-transitory computer readable instructions; and
a processor for executing the computer readable instructions such that the processor when executing implements a method of analyzing jump links in a page according to any of claims 1-6.
14. A computer readable storage medium storing non-transitory computer readable instructions which, when executed by a computer, cause the computer to perform the method of analyzing jump links in a page of any of claims 1-6.
CN201910263875.3A 2019-04-03 2019-04-03 Method and device for analyzing jump link in page Active CN110046310B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910263875.3A CN110046310B (en) 2019-04-03 2019-04-03 Method and device for analyzing jump link in page

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910263875.3A CN110046310B (en) 2019-04-03 2019-04-03 Method and device for analyzing jump link in page

Publications (2)

Publication Number Publication Date
CN110046310A CN110046310A (en) 2019-07-23
CN110046310B true CN110046310B (en) 2020-12-08

Family

ID=67275942

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910263875.3A Active CN110046310B (en) 2019-04-03 2019-04-03 Method and device for analyzing jump link in page

Country Status (1)

Country Link
CN (1) CN110046310B (en)



Also Published As

Publication number Publication date
CN110046310A (en) 2019-07-23


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant