CN110740074B - Network address detection method and device and electronic equipment - Google Patents

Network address detection method and device and electronic equipment

Info

Publication number
CN110740074B
Authority
CN
China
Prior art keywords
network address
determining
network
seed
addresses
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910780723.0A
Other languages
Chinese (zh)
Other versions
CN110740074A (en
Inventor
王玲玉
付宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Advanced New Technologies Co Ltd
Advantageous New Technologies Co Ltd
Original Assignee
Advanced New Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Advanced New Technologies Co Ltd
Priority to CN201910780723.0A
Publication of CN110740074A
Application granted
Publication of CN110740074B
Legal status: Active (current)
Anticipated expiration

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00: Arrangements for monitoring or testing data switching networks
    • H04L43/08: Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0805: Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/955: Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
    • G06F16/9566: URL specific, e.g. using aliases, detecting broken or misspelled links
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/958: Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G06F16/986: Document structures and storage, e.g. HTML extensions

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Environmental & Geological Engineering (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The embodiments of this specification provide a method and a device for detecting a network address, and electronic equipment. One of the methods comprises: acquiring behavior data of a user performing webpage operations; extracting a network address contained in the behavior data; and performing availability detection on the network address. In one embodiment, the detection method can check the availability of network addresses of webpages that are not configured in the application platform in advance but can be accessed from the application platform, which can improve the reliability of network address detection.

Description

Network address detection method and device and electronic equipment
Technical Field
The present disclosure relates to the field of network address detection technologies, and in particular, to a method and an apparatus for detecting a network address, and an electronic device.
Background
With the rapid development of internet technology and the spread of smart devices, more and more users are accustomed to running applications on electronic equipment such as smartphones, palmtop computers, and tablet computers, and to interacting with those applications through the application interface presented on the display screen in order to obtain the corresponding application functions or services.
A merchant can host an H5 service or an applet on the application platform to provide corresponding business services to users. Specifically, the merchant needs to pre-configure the network address of an initial web page in the application platform, so that a user can perform a page operation to view the initial web page corresponding to that pre-configured network address. In actual use, the user can also reach other web pages of the service through the initial web page.
The application platform needs to monitor the availability of these services by detecting the network addresses configured in the application platform. However, because the application platform cannot obtain from the merchant the network addresses of the other web pages of a service, it cannot detect the availability of those network addresses. Therefore, a reliable network address detection scheme is needed.
Disclosure of Invention
The embodiment of the specification provides a new technical scheme for detecting a network address in an application platform.
According to a first aspect of the present specification, there is provided a method for detecting a network address, including:
acquiring behavior data of a user for executing webpage operation;
extracting a network address contained in the behavior data;
and carrying out availability detection on the network address.
Optionally, the method further includes:
classifying the network addresses according to the association degree among the network addresses;
determining the business object to which each classification belongs according to the network addresses contained in that classification; and
obtaining, according to the business object to which each classification belongs, the business object to which an abnormal network address detected by the availability detection belongs.
Optionally, the determining the business object to which the corresponding classification belongs includes:
searching for a preset initial address contained in each classification; and
taking the known business object to which the initial address belongs as the business object to which the corresponding classification belongs.
Optionally, the method further includes: extracting a jump relation between network addresses existing in the behavior data, wherein the jump relation represents a jump from the network address of one webpage to the network address of another webpage; and
wherein the classifying the network address comprises:
classifying the network addresses according to the association degree between the network addresses having the jump relation.
Optionally, the classifying the network address includes:
setting each target network address as a seed network address, wherein the target network address at least comprises the initial address;
acquiring all lower-level network addresses of the seed network address according to the jump relation;
according to the association degree between the seed network address and each lower-level network address, obtaining the lower-level network address which belongs to the same service object with the seed network address to form service association data;
taking the lower-level network address belonging to the same service object as the seed network address as a target network address;
and classifying the network addresses according to the service associated data corresponding to each network address.
Optionally, the classifying the network address further includes:
determining a word frequency-inverse text frequency index value of each lower-level network address;
and determining the association degree between the corresponding lower-level network address and the seed network address according to the word frequency-inverse text frequency index value.
Optionally, the determining the word frequency-inverse text frequency index value of each lower-level network address includes:
taking each lower-level network address as a current network address in turn;
determining the number of times the current network address appears in the seed network address as a first count;
determining the number of times all the lower-level network addresses appear in the seed network address as a second count;
determining the word frequency of the current network address according to the first count and the second count;
determining the number of times the current network address appears in all seed network addresses as a third count;
determining the total number of all seed network addresses;
determining the inverse text frequency index of the current network address according to the total number of all seed network addresses and the third count;
and obtaining the word frequency-inverse text frequency index value of the current network address according to the word frequency and the inverse text frequency index.
Optionally, the method further includes:
under the condition that an abnormal network address is detected, determining the abnormal business object to which the abnormal network address belongs; and
sending an alarm to notify the abnormal business object.
Optionally, the method further includes:
extracting a jump relationship between network addresses present in the behavior data, wherein the jump relationship represents a jump from the network address of one web page to the network address of another web page; and
obtaining associated data among the network addresses according to the jump relationship;
wherein the detecting the availability of the network address comprises:
carrying out availability detection on the network address according to the associated data.
Optionally, the performing, according to the association data, availability detection on the network address includes:
determining the detection frequency of each network address according to the associated data; and
carrying out availability detection on the corresponding network address according to the detection frequency.
Optionally, the method further includes:
determining the jump probability of each network address according to the behavior data;
determining the detection frequency of the corresponding network address according to the jump probability; and
carrying out availability detection on the corresponding network address according to the detection frequency.
Optionally, the method further includes:
and displaying the abnormal network address under the condition that the abnormal network address is detected.
According to a second aspect of the present specification, there is provided an apparatus for detecting a network address, comprising:
the data acquisition module is used for acquiring behavior data of webpage operation executed by a user;
the address extraction module is used for extracting a network address contained in the behavior data;
and the availability detection module is used for carrying out availability detection on the network address.
Optionally, the apparatus further comprises:
the classification module is used for classifying the network addresses according to the association degree among the network addresses;
the object determining module is used for determining the business object to which each classification belongs according to the network addresses contained in that classification; and
the abnormality determining module is used for obtaining, according to the business object to which each classification belongs, the business object to which an abnormal network address detected by the availability detection belongs.
Optionally, the object determination module is further configured to:
search for a preset initial address contained in each classification; and
take the known business object to which the initial address belongs as the business object to which the corresponding classification belongs.
Optionally, the apparatus further comprises:
a jump relation extraction module, configured to extract a jump relation between network addresses existing in the behavior data, where the jump relation represents a jump from the network address of one web page to the network address of another web page; and
wherein the classification module is further configured to:
classify the network addresses according to the association degree between the network addresses having the jump relation.
Optionally, the classification module further includes:
a seed address setting unit, configured to set each target network address as a seed network address, where the target network address at least includes the initial address;
a lower address obtaining unit, configured to obtain all lower network addresses of the seed network address according to the jump relationship;
the associated data obtaining unit is used for obtaining the lower-level network address which belongs to the same service object with the seed network address according to the association degree between the seed network address and each lower-level network address to form service associated data;
the target address setting unit is used for taking the lower-level network address which belongs to the same service object with the seed network address as a target network address;
and the classification unit is used for classifying the network addresses according to the service associated data corresponding to each network address.
Optionally, the associated data obtaining unit is further configured to:
determining a word frequency-inverse text frequency index value of each lower-level network address;
and determining the association degree between the corresponding lower-level network address and the seed network address according to the word frequency-inverse text frequency index value.
Optionally, the determining the word frequency-inverse text frequency index value of each lower-level network address includes:
taking each lower-level network address as a current network address in turn;
determining the number of times the current network address appears in the seed network address as a first count;
determining the number of times all the lower-level network addresses appear in the seed network address as a second count;
determining the word frequency of the current network address according to the first count and the second count;
determining the number of times the current network address appears in all seed network addresses as a third count;
determining the total number of all seed network addresses;
determining the inverse text frequency index of the current network address according to the total number of all seed network addresses and the third count;
and obtaining the word frequency-inverse text frequency index value of the current network address according to the word frequency and the inverse text frequency index.
Optionally, the apparatus further comprises:
the module is used for determining an abnormal business object to which the abnormal network address belongs under the condition that the abnormal network address is detected; and
and the module is used for alarming and reminding the abnormal business object.
Optionally, the apparatus further comprises:
a module for extracting a jump relationship between network addresses present in the behavior data, wherein the jump relationship represents a jump from the network address of one web page to the network address of another web page; and
a module for obtaining the associated data between the network addresses according to the jump relationship;
wherein the availability detection module is further configured to:
carry out availability detection on the network address according to the associated data.
Optionally, the performing, according to the association data, availability detection on the network address includes:
determining the detection frequency of each network address according to the associated data; and
carrying out availability detection on the corresponding network address according to the detection frequency.
Optionally, the apparatus further comprises:
a module for determining the jump probability of each network address according to the behavior data;
a module for determining the detection frequency of the corresponding network address according to the jump probability; and
a module for carrying out availability detection on the corresponding network address according to the detection frequency.
Optionally, the apparatus further comprises:
and the module is used for displaying the abnormal network address under the condition that the abnormal network address is detected.
According to a third aspect of the present description, there is provided an electronic device comprising the apparatus according to the second aspect of the present description; or comprising a processor and a memory, the memory being arranged to store executable instructions for controlling the processor to perform the method according to the first aspect of the specification.
Other features of the present description and advantages thereof will become apparent from the following detailed description of exemplary embodiments thereof, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the specification and together with the description, serve to explain the principles of the specification.
FIG. 1 is a block diagram of one example of a hardware configuration of an electronic device that may be used to implement an embodiment.
Fig. 2 is a block diagram of another example of a hardware configuration of an electronic device that can be used to implement another embodiment.
Fig. 3 shows a flowchart of a network address detection method of the first embodiment.
Fig. 4 shows a flowchart of a network address detection method of the second embodiment.
Fig. 5 shows a flowchart of a network address detection method of the third embodiment.
FIG. 6 illustrates a network address classification method according to an embodiment.
Fig. 7 shows a flowchart of an example of a method of detecting a network address.
Fig. 8 is a block diagram showing an example of the network address detection device.
Fig. 9 is a block diagram showing an example of the network address detection device.
Fig. 10 is a block diagram showing an example of the network address detection device.
FIG. 11 illustrates a block diagram of an electronic device of an embodiment.
Detailed Description
Various exemplary embodiments of the present specification will now be described in detail with reference to the accompanying drawings.
The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the specification, its application, or uses.
In all of the examples shown and discussed herein, any particular value is exemplary only and not limiting. Thus, the specific values of the exemplary embodiments may have different values in other examples.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, further discussion thereof is not required in subsequent figures.
< hardware configuration >
Figs. 1 and 2 are block diagrams of hardware configurations of an electronic device 1000 that can be used to implement the network address detection method of any embodiment of the present specification.
In one embodiment, as shown in FIG. 1, the electronic device 1000 may be a server 1100.
The server 1100 provides a service point for processes, databases, and communications facilities. The server 1100 can be a unitary server or a distributed server across multiple computers or computer data centers. The server may be of various types, such as, but not limited to, a web server, a news server, a mail server, a message server, an advertisement server, a file server, an application server, an interaction server, a database server, or a proxy server. In some embodiments, each server may include hardware, software, or embedded logic components, or a combination of two or more such components, for performing the appropriate functions supported or implemented by the server. For example, the server may be a blade server, a cloud server, or the like, or may be a server group consisting of a plurality of servers, which may include one or more of the above types of servers.
In this embodiment, the server 1100 may include a processor 1110, a memory 1120, an interface device 1130, a communication device 1140, a display device 1150, and an input device 1160, as shown in fig. 1.
In this embodiment, the server 1100 may also include a speaker, a microphone, and the like, which are not limited herein.
The processor 1110 may be a dedicated server processor, or may be a desktop processor, a mobile version processor, or the like that meets performance requirements, and is not limited herein. The memory 1120 includes, for example, a ROM (read only memory), a RAM (random access memory), a nonvolatile memory such as a hard disk, and the like. The interface device 1130 includes, for example, various bus interfaces such as a serial bus interface (including a USB interface), a parallel bus interface, and the like. The communication device 1140 is capable of wired or wireless communication, for example. The display device 1150 is, for example, a liquid crystal display panel, an LED display panel, a touch panel, or the like. The input device 1160 may include, for example, a touch screen, a keyboard, and the like.
In this embodiment, the memory 1120 of the server 1100 is configured to store instructions for controlling the processor 1110 to operate at least to perform the method for detecting a network address according to any embodiment of the present description. The skilled person can design the instructions according to the solution disclosed in the present specification. How the instructions control the operation of the processor is well known in the art and will not be described in detail herein.
Although shown as multiple devices of server 1100 in fig. 1, this description may refer to only some of the devices, e.g., server 1100 may refer to only memory 1120 and processor 1110.
In one embodiment, the electronic device 1000 may be a terminal device 1200 such as a PC, a notebook computer, or the like used by an operator, which is not limited herein.
In this embodiment, referring to fig. 2, the terminal apparatus 1200 may include a processor 1210, a memory 1220, an interface device 1230, a communication device 1240, a display device 1250, an input device 1260, a speaker 1270, a microphone 1280, and the like.
The processor 1210 may be a mobile version processor. The memory 1220 includes, for example, a ROM (read only memory), a RAM (random access memory), a nonvolatile memory such as a hard disk, and the like. The interface device 1230 includes, for example, a USB interface, a headphone interface, and the like. The communication device 1240 may be capable of wired or wireless communication. For example, the communication device 1240 may include a short-range communication device, such as any device that performs short-range wireless communication based on a short-range wireless communication protocol such as the Hilink protocol, Wi-Fi (IEEE 802.11 protocol), Mesh, Bluetooth, ZigBee, Thread, Z-Wave, NFC, UWB, or Li-Fi, and the communication device 1240 may also include a long-range communication device, such as any device that performs WLAN, GPRS, or 2G/3G/4G/5G long-range communication. The display device 1250 is, for example, a liquid crystal display, a touch panel, or the like. The input device 1260 may include, for example, a touch screen, a keyboard, and the like. A user can output and input voice information through the speaker 1270 and the microphone 1280, respectively.
In this embodiment, the memory 1220 of the terminal device 1200 is configured to store instructions for controlling the processor 1210 to operate at least to perform a method of detecting a network address according to any embodiment of the present description. The skilled person can design the instructions according to the solution disclosed in the present specification. How the instructions control the operation of the processor is well known in the art and will not be described in detail herein.
Although a plurality of devices of the terminal apparatus 1200 are shown in fig. 2, the present specification may refer to only some of the devices, for example, the terminal apparatus 1200 refers to only the memory 1220, the processor 1210 and the display device 1250.
< method examples >
Fig. 3 is a schematic flow chart of a method for detecting a network address according to an embodiment of the present disclosure.
In one example, the method shown in fig. 3 may be implemented by only the server or the terminal device, or may be implemented by both the server and the terminal device. In one embodiment, the terminal device may be the terminal device 1200 shown in fig. 2 and the server may be the server 1100 shown in fig. 1.
As shown in fig. 3, the method of the present embodiment includes the following steps S302 to S306:
step S302, behavior data of the user for executing the webpage operation is obtained.
Behavior data is data that characterizes the behavior of a user performing a web page operation. From this behavior data, at least the network address of the web page visited by the user, and the network address of the previous web page, can be determined. In this embodiment, behavior data of web page operations performed by a plurality of users may be acquired.
In one or more embodiments of the present description, tracking points (buried points) may be set in advance in an application platform that hosts H5 pages or applets of external merchants, so as to obtain behavior data of a plurality of users performing web page operations in the application platform. The behavior data may reflect the trajectory of a user's web page visits in the application platform.
In one or more embodiments of the present description, the behavior data may be acquired at a set sampling frequency within a set period. The set period may be set in advance according to an application scenario or a specific requirement; for example, the set period may be 1 day. The set sampling frequency may likewise be set in advance according to an application scenario or specific requirements; for example, the set sampling frequency may be once per minute. In that case, the behavior data of the user performing web page operations is acquired every 1 minute within 1 day.
Step S304, extracts the network address included in the behavior data.
The network address may be a Uniform Resource Locator (URL), which is a compact representation of the location and access method of a Resource available from the internet and is an address of a standard Resource on the internet.
The network address contained in the behavior data is the network address that the user has accessed. The number of the network addresses extracted in this step may be one or more.
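Steps S302 to S304 can be illustrated with a minimal sketch. The record layout is an assumption made for illustration only: the specification does not define how behavior data is stored, so the field names `prev_url` and `url` and the function `extract_addresses` are hypothetical.

```python
from typing import Iterable


def extract_addresses(records: Iterable[dict]) -> list[str]:
    """Collect the distinct network addresses found in behavior records.

    Each record is assumed (hypothetically) to carry the visited address in
    "url" and the address of the previous web page in "prev_url".
    Addresses are returned in first-seen order.
    """
    seen: dict[str, None] = {}
    for record in records:
        for key in ("prev_url", "url"):
            address = record.get(key)
            if address:
                seen.setdefault(address, None)
    return list(seen)


# Hypothetical behavior data collected from two users.
records = [
    {"user": "u1", "prev_url": None, "url": "https://merchant.example/home"},
    {"user": "u1", "prev_url": "https://merchant.example/home",
     "url": "https://merchant.example/cart"},
    {"user": "u2", "prev_url": "https://merchant.example/home",
     "url": "https://merchant.example/cart"},
]
addresses = extract_addresses(records)
```

Deduplicating here keeps the later availability check from probing the same address once per user visit.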
Step S306, the availability detection is carried out on the network address.
Specifically, the availability of the network address may be detected by simulating access to the network address and checking whether the access succeeds.
In one embodiment, the availability of network addresses of web pages that are not pre-configured in the application platform but are accessible from the application platform may be detected, which may improve the reliability of network address detection.
Moreover, the network addresses to be detected can be obtained without requiring merchants hosted on the application platform to set buried points for those network addresses in the internal code of their H5 pages or applets. This further simplifies the availability detection process, raises the rate at which unavailable network addresses accessed through the application platform are discovered, and improves the user experience.
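The availability detection of step S306 might be sketched as follows. This is an illustration under stated assumptions, not the method defined by the specification: it assumes that "simulating access" means issuing an HTTP GET, and that a 2xx or 3xx status means the address is available. The names `http_status`, `is_available`, `find_abnormal`, and `fake_fetch` are hypothetical; the fetch function is injectable so the logic can be exercised without network access.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable


def http_status(url: str, timeout: float = 5.0) -> int:
    """Simulate a visit with an HTTP GET and return the response status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx/5xx responses are raised as HTTPError; report their status code.
        return error.code


def is_available(url: str, fetch: Callable[[str], int] = http_status) -> bool:
    """Assumed criterion: 2xx/3xx is available; connection failures are not."""
    try:
        return 200 <= fetch(url) < 400
    except (OSError, ValueError):
        # URLError and timeouts are OSError subclasses; ValueError covers
        # malformed addresses.
        return False


def find_abnormal(addresses: Iterable[str],
                  fetch: Callable[[str], int] = http_status) -> list[str]:
    """Return the addresses that fail the availability check."""
    return [address for address in addresses if not is_available(address, fetch)]


def fake_fetch(url: str) -> int:
    """Stand-in for a real request, used only for this offline example."""
    if url.endswith("/down"):
        raise OSError("connection refused")
    return 404 if url.endswith("/missing") else 200


abnormal = find_abnormal(
    [
        "https://merchant.example/home",
        "https://merchant.example/missing",
        "https://merchant.example/down",
    ],
    fake_fetch,
)
```

In a real deployment the default `http_status` would be used in place of `fake_fetch`.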
In one or more embodiments of the present specification, if a plurality of network addresses are extracted in step S304, then the method may further include steps S402 to S406 as shown in fig. 4:
step S402, classifying the network addresses according to the association degree among the network addresses.
In one or more embodiments of the present specification, the association degree may be a parameter representing a degree of association between network addresses, that is, a parameter representing an association degree of URLs of two network addresses.
In the first embodiment of the present specification, the association degree between every two network addresses may be determined, and two network addresses with the association degree greater than or equal to a preset association degree threshold value may be classified into the same category.
In one or more embodiments of the present disclosure, the association degree threshold may be set in advance according to an application scenario or a specific requirement. Different association degree thresholds may be set for different application scenarios. For example, the association degree threshold may be, but is not limited to, 0.8, in which case two network addresses whose association degree is greater than or equal to 0.8 are classified into the same category.
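The threshold-based classification of the first embodiment can be sketched as below. How the association degree is computed is left open at this point in the text, so the pairwise scores are supplied as hypothetical input. The sketch also assumes that categories merge transitively (if A and B are grouped and B and C are grouped, all three share a category), which the text does not state explicitly. The function name `classify` is hypothetical.

```python
def classify(addresses, association, threshold=0.8):
    """Group addresses whose pairwise association degree reaches the threshold.

    `association` maps (address_a, address_b) pairs to a score.
    A union-find structure merges the categories of qualifying pairs.
    """
    parent = {address: address for address in addresses}

    def find(address):
        # Follow parent links to the category representative (path halving).
        while parent[address] != address:
            parent[address] = parent[parent[address]]
            address = parent[address]
        return address

    for (first, second), score in association.items():
        if score >= threshold:
            parent[find(first)] = find(second)

    groups = {}
    for address in addresses:
        groups.setdefault(find(address), []).append(address)
    return list(groups.values())


# Hypothetical association degrees between four network addresses.
scores = {("A", "B"): 0.9, ("B", "C"): 0.8, ("C", "D"): 0.3}
categories = classify(["A", "B", "C", "D"], scores)
```

With the example threshold of 0.8, A, B, and C end up in one category and D stays alone.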
In one or more embodiments of the present description, the method may further include: extracting jump relations among the network addresses existing in the behavior data, wherein a jump relation represents a jump from the network address of one webpage to the network address of another webpage.
Then, classifying the network address may further include:
and classifying the network addresses according to the association degree of the network addresses with the jump relation.
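Extracting jump relations from behavior data might look like the following sketch. It assumes, hypothetically, that each behavior record holds the previous page address in `prev_url` and the visited address in `url`; counting how often each jump occurs is also an added assumption, made because such counts are convenient for later frequency-based scoring.

```python
from collections import Counter, defaultdict


def extract_jumps(records):
    """Build a mapping source address -> {target address: jump count}."""
    jumps = defaultdict(Counter)
    for record in records:
        source, target = record.get("prev_url"), record.get("url")
        # Skip entry pages (no previous address) and page reloads.
        if source and target and source != target:
            jumps[source][target] += 1
    return {source: dict(targets) for source, targets in jumps.items()}


# Hypothetical behavior records.
records = [
    {"prev_url": None, "url": "home"},
    {"prev_url": "home", "url": "cart"},
    {"prev_url": "home", "url": "cart"},
    {"prev_url": "home", "url": "help"},
    {"prev_url": "cart", "url": "pay"},
]
jump_relation = extract_jumps(records)
```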
In the second embodiment of the present specification, classifying the network addresses according to the association degree between the network addresses having the jump relationship may include steps S502 to S510 shown in fig. 5:
Step S502, each target network address is set as a seed network address, wherein the target network address includes at least the initial address.
Step S504, all the lower-level network addresses of the seed network address are obtained according to the jump relation.
In one embodiment, a lower-level network address may be any network address that can be opened after one or more jumps starting from the seed network address.
For example, the network address of the next web page corresponding to the seed network address URL1 in the jump relationship is URL1.1, and the network address of the next web page corresponding to the network address URL1.1 in the jump relationship is URL1.1.1, then the network address URL1.1 and the network address URL1.1.1 are both the lower-level network addresses of the seed network address URL1.
In another embodiment, the lower level network address may also be the network address of the next web page corresponding to the seed network address in the jump relationship.
For example, as shown in fig. 6, the network addresses of the next web pages corresponding to the seed network address URL1 in the jump relation include URL1.1, URL1.2, and URL1.3, the network addresses of the next web pages corresponding to the network address URL1.1 include URL1.1.1, URL1.1.2, and URL1.1.3, and the network addresses of the next web pages corresponding to the network address URL1.2 include URL1.2.1 and URL1.2.2. In this case, the network addresses URL1.1, URL1.2, and URL1.3 are the lower-level network addresses of the seed network address URL1, while the network addresses URL1.1.1, URL1.1.2, URL1.1.3, URL1.2.1, and URL1.2.2 are not lower-level network addresses of the seed network address URL1.
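The two readings of "lower-level network address" described above can be contrasted with a minimal sketch. The dictionary `JUMPS` and the function names are hypothetical; the sketch only assumes that the jump relation can be stored as a mapping from each network address to the addresses reachable from it in one jump, mirroring the example of fig. 6.

```python
from collections import deque

# Hypothetical jump relation mirroring fig. 6:
# each address maps to the addresses reachable from it in one jump.
JUMPS = {
    "URL1": ["URL1.1", "URL1.2", "URL1.3"],
    "URL1.1": ["URL1.1.1", "URL1.1.2", "URL1.1.3"],
    "URL1.2": ["URL1.2.1", "URL1.2.2"],
}


def direct_lower_level(jumps, seed):
    """Narrower reading: only addresses one jump away from the seed."""
    return list(jumps.get(seed, []))


def all_lower_level(jumps, seed):
    """Broader reading: every address reachable through any number of jumps."""
    seen, order, queue = {seed}, [], deque([seed])
    while queue:
        current = queue.popleft()
        for nxt in jumps.get(current, []):
            if nxt not in seen:  # guards against cycles in real click data
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


print(direct_lower_level(JUMPS, "URL1"))
print(all_lower_level(JUMPS, "URL1"))
```

Under the narrower reading URL1 has three lower-level addresses; under the broader reading it has eight, because the pages below URL1.1 and URL1.2 are included as well.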
Step S506, according to the association degree between the seed network address and each lower-level network address, obtaining the lower-level network address belonging to the same service object as the seed network address, and forming service association data.
Specifically, the lower-level network address whose association with the seed network address is greater than or equal to a preset association threshold may be used as the lower-level network address belonging to the same service object as the seed network address.
On this basis, the method may further include a step of determining a degree of association between the seed network address and each lower-level network address, specifically including steps S602 to S604 shown below:
Step S602, determining a word frequency-inverse text frequency index value of each lower-level network address.
In this embodiment, the term frequency-inverse text frequency index is the TF-IDF value. TF-IDF (term frequency-inverse document frequency index) is a commonly used weighting technique for information retrieval and data mining.
The TF-IDF value evaluates the importance of a lower-level network address to its corresponding seed network address among all seed network addresses. The importance of a lower-level network address increases in proportion to the number of times it appears under the corresponding seed network address, but decreases as the frequency with which it appears under all seed network addresses rises. The word frequency (TF) represents the frequency with which the lower-level network address appears under the corresponding seed network address; the larger the TF, the greater the importance of the lower-level network address to that seed network address. The inverse text frequency index (IDF) reflects how widely the lower-level network address appears across all seed network addresses; the more seed network addresses it appears under, the smaller the IDF and the lower the importance of the lower-level network address.
And if the frequency TF of the appearance of a lower-level network address in the corresponding seed network address is high and the lower-level network address rarely appears in other seed addresses, the lower-level network address and the corresponding seed network address are considered to have higher association degree and belong to the same service object.
In one embodiment, the word frequency and the inverse text frequency index of each lower-level network address are respectively calculated, and then the product of the word frequency and the inverse text frequency index of each lower-level network address is respectively calculated as the word frequency-inverse text frequency index value of the corresponding lower-level network address.
The manner of determining the word frequency-inverse text frequency index value of each lower level network address may include:
taking each lower-level network address as a current network address in turn; respectively determining word frequency and an inverse text frequency index of the current network address; and determining the word frequency-inverse text frequency index value of the current network address according to the word frequency and the inverse text frequency index of the current network address.
Specifically, the word frequency of the current network address may be calculated as follows: determining the number of times the current network address appears in the seed network address as a first number of times; determining the number of times all lower-level network addresses appear in the seed network address as a second number of times; and determining a first ratio of the first number of times to the second number of times as the word frequency of the current network address. The inverse text frequency index of the current network address may be calculated as follows: determining the total number of all seed network addresses, and determining the number of times the current network address appears in all seed network addresses as a third number of times; determining a second ratio of the total number of all seed network addresses to the third number of times; and taking the base-10 logarithm of the second ratio as the inverse text frequency index of the current network address.
For example, if the total number of all the lower-level network addresses of the seed network address URL1 is N1, and the number of occurrences of the current network address URL1.1 is N2, the word frequency of the current network address URL1.1 in the seed network address URL1 is N2/N1.
If the number of occurrences of the current network address URL1.1 in all seed network addresses is M1 and the total number of seed network addresses is M2, then the inverse text frequency index of the current network address URL1.1 may be lg (M2/M1).
Then, the TF-IDF value of the current network address URL1.1 may be the product of the corresponding word frequency and the inverse text frequency index, i.e. N2/N1 × lg (M2/M1).
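The calculation above can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the function name `tf_idf`, the table `JUMP_COUNTS`, and all counts are hypothetical, and the "third number of times" (M1) is interpreted here as the number of seed network addresses under which the current address appears at least once.

```python
import math
from collections import Counter

# Hypothetical jump counts: seed address -> how often each lower-level
# address was reached from it in the behavior data.
JUMP_COUNTS = {
    "URL1": Counter({"URL1.1": 6, "URL1.2": 3, "URL_help": 1}),
    "URL2": Counter({"URL2.1": 5, "URL_help": 5}),
    "URL3": Counter({"URL3.1": 8, "URL_help": 2}),
}


def tf_idf(jump_counts, seed, address):
    """TF-IDF of `address` relative to `seed`: (N2/N1) * lg(M2/M1)."""
    counts = jump_counts[seed]
    first = counts.get(address, 0)   # N2: occurrences under this seed
    if first == 0:
        return 0.0                   # address never reached from this seed
    second = sum(counts.values())    # N1: all occurrences under this seed
    tf = first / second
    total_seeds = len(jump_counts)   # M2
    # M1 (assumption): number of seeds under which the address appears
    third = sum(1 for c in jump_counts.values() if c.get(address, 0) > 0)
    idf = math.log10(total_seeds / third)
    return tf * idf


print(tf_idf(JUMP_COUNTS, "URL1", "URL1.1"))    # 0.6 * lg(3/1)
print(tf_idf(JUMP_COUNTS, "URL1", "URL_help"))  # lg(3/3) = 0, so the value is 0
```

In this invented data, URL_help is reached from every seed, so its inverse text frequency index is zero and it is not tied to any one seed, whereas URL1.1 is reached only from URL1 and receives a positive value.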
In another embodiment, the TF-IDF value of each lower-level network address can also be determined by a TF-IDF model obtained by training in advance.
Step S604, according to the word frequency-inverse text frequency index value of each lower-level network address, determining the association degree between the corresponding lower-level network address and the seed network address.
In one embodiment, the TF-IDF value of each lower network address may be used as the association between the corresponding lower network address and the seed network address.
For example, for the lower level network address URL1.1 of the seed network address URL1, the TF-IDF value of the lower level network address URL1.1 is calculated in step S602, and the TF-IDF value can be used as the association degree between the lower level network address URL1.1 and the seed network address URL1.
Step S508, the lower level network address belonging to the same service object as the seed network address is used as the target network address.
In this embodiment, by taking the lower-level network addresses that belong to the same service object as the seed network address as new target network addresses, the process is iterated, and each iteration yields the lower-level network addresses that belong to the same service object as the seed network addresses of that iteration.
Further, the iteration may be ended when the number of iterations reaches a preset number of iterations. Alternatively, the iteration may be ended when an iteration yields no lower-level network address that belongs to the same service object as any seed network address.
Step S510, classifying the network addresses according to the service association data corresponding to each network address.
Specifically, the network addresses belonging to the same service object may be classified into the same category.
In the example shown in fig. 6, in the first iteration process, the initial address URL1 may be used as a seed network address, and the lower-level network address that belongs to the same service object as the seed network address URL1 includes URL1.1 and URL1.2. In the second iteration process, the lower-level network addresses URL1.1 and URL1.2 are respectively used as seed network addresses, the lower-level network addresses which belong to the same service object with the seed network address URL1.1 are obtained to comprise URL1.1.1 and URL1.1.3, and the lower-level network addresses which belong to the same service object with the seed network address URL1.2 are obtained to comprise URL1.2.1. In the third iteration process, the lower-level network addresses URL1.1.1, URL1.1.3 and URL1.2.1 are respectively used as seed network addresses, the lower-level network address belonging to the same service object as the seed network address URL1.1.1 is not obtained, the lower-level network address belonging to the same service object as the seed network address URL1.1.3 is URL1.1.3.1, and the lower-level network address belonging to the same service object as the seed network address URL1.2.1 is not obtained. In the fourth iteration process, the lower-level network address URL1.1.3.1 is used as the seed network address, the lower-level network address which belongs to the same service object as the seed network address URL1.1.3.1 is not obtained, and the iteration process is ended.
The lower level network address belonging to the same business object as the initial address URL1 includes network addresses URL1.1 and URL1.2, the lower level network address belonging to the same business object as the network address URL1.1 includes network addresses URL1.1.1 and URL1.1.3, the lower level network address belonging to the same business object as the network address URL1.2 includes a network address URL1.2.1, and the lower level network address belonging to the same business object as the network address URL1.1.3 is URL1.1.3.1, so that the initial address URL1, the network address URL1.1, the network address URL1.2, the network address URL1.1.1, the network address URL1.1.3, the network address URL1.2.1 and the network address URL1.1.3.1 belong to the same category.
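The iteration of steps S502 to S510 can be sketched as follows, assuming the narrower reading of "lower-level network address" (one jump from the seed). The function name, the tables `JUMPS` and `ASSOCIATION`, and every association value are hypothetical, chosen only so that the result reproduces the example of fig. 6; in practice the association degree could be the TF-IDF value of step S602.

```python
# Hypothetical jump relation extending fig. 6 by one level.
JUMPS = {
    "URL1": ["URL1.1", "URL1.2", "URL1.3"],
    "URL1.1": ["URL1.1.1", "URL1.1.2", "URL1.1.3"],
    "URL1.2": ["URL1.2.1", "URL1.2.2"],
    "URL1.1.3": ["URL1.1.3.1"],
}

# Invented association degrees for (seed, lower-level address) pairs.
ASSOCIATION = {
    ("URL1", "URL1.1"): 0.9, ("URL1", "URL1.2"): 0.85, ("URL1", "URL1.3"): 0.2,
    ("URL1.1", "URL1.1.1"): 0.9, ("URL1.1", "URL1.1.2"): 0.1,
    ("URL1.1", "URL1.1.3"): 0.8,
    ("URL1.2", "URL1.2.1"): 0.95, ("URL1.2", "URL1.2.2"): 0.3,
    ("URL1.1.3", "URL1.1.3.1"): 0.9,
}


def classify_from_initial(initial, jumps, association, threshold=0.8, max_rounds=10):
    """Grow one category from an initial address, one level per iteration."""
    members, seen = [initial], {initial}
    service_links = {}            # seed -> lower-level addresses kept for it
    targets = [initial]
    for _ in range(max_rounds):   # first stop condition: preset iteration count
        next_targets = []
        for seed in targets:
            kept = []
            for addr in jumps.get(seed, []):
                if addr in seen:
                    continue
                if association.get((seed, addr), 0.0) >= threshold:
                    seen.add(addr)
                    kept.append(addr)
            if kept:
                service_links[seed] = kept
            next_targets.extend(kept)
        if not next_targets:      # second stop condition: nothing new was found
            break
        members.extend(next_targets)
        targets = next_targets
    return members, service_links


members, links = classify_from_initial("URL1", JUMPS, ASSOCIATION)
print(members)
print(links)
```

With these values the category grown from URL1 contains URL1, URL1.1, URL1.2, URL1.1.1, URL1.1.3, URL1.2.1 and URL1.1.3.1, and the returned links form the tree-shaped service association data.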
In the third embodiment of the present specification, an initial address corresponding to each business object may be configured in advance, and then, by classifying network addresses, each classification includes an initial address.
Specifically, each target network address is set as a seed network address, wherein the target network address at least comprises an initial address; according to the association degree between the seed network address and other network addresses, the other network addresses that belong to the same service object as the seed network address are obtained, forming service association data, wherein the other network addresses are network addresses other than the seed network address; the other network addresses that belong to the same service object as the seed network address are taken as target network addresses; and the network addresses are classified according to the service association data corresponding to each network address.
The implementation manner of each step in this embodiment may specifically refer to the second embodiment described above, and is not described herein again.
Step S404, according to the network address included in each classification, determining the business object to which the corresponding classification belongs.
In one or more embodiments of the present specification, the initial address corresponding to each business object may be configured in advance, and then determining the business object to which the corresponding classification belongs according to the network address included in each classification may include:
searching a preset initial address contained in each classification; and taking the service object to which the known initial address belongs as the service object to which the corresponding classification belongs.
For example, if the initial address of the service object A configured on the application platform is URL1, and the classification that includes URL1 also includes the network addresses URL1.1 to URL1.6, then the service object to which each of the network addresses URL1.1 to URL1.6 belongs is the service object A.
Step S406, obtaining the service object to which the abnormal network address detected by the availability detection belongs according to the service object to which each classification belongs.
Specifically, in the process of detecting the availability of the network address, if an abnormal network address is detected, the abnormal service object to which the abnormal network address belongs may be obtained according to the service object to which the classification including the abnormal network address belongs, and the abnormal service object may be prompted to maintain the abnormal network address.
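Steps S404 and S406 amount to two lookups, which can be sketched as follows. The function name and the sample data are hypothetical, and the handling of a classification that contains no initial address, or initial addresses of several service objects, is an assumption of this sketch rather than something the description specifies.

```python
def owners_by_address(classifications, initial_owner):
    """Map every classified address to the service object of its category."""
    owner_of = {}
    for members in classifications:
        owners = {initial_owner[a] for a in members if a in initial_owner}
        if len(owners) != 1:
            continue  # assumption: skip categories with no or conflicting owners
        owner = next(iter(owners))
        for addr in members:
            owner_of[addr] = owner
    return owner_of


# Hypothetical configuration: initial address -> service object.
initial_owner = {"URL1": "A", "URL2": "B"}
classifications = [
    ["URL1", "URL1.1", "URL1.2"],
    ["URL2", "URL2.1"],
]

owner_of = owners_by_address(classifications, initial_owner)
abnormal = "URL1.2"                      # address flagged by availability detection
print(abnormal, owner_of.get(abnormal))  # service object to be prompted
```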
In an embodiment, the detection method can remind the abnormal service object to which the abnormal network address belongs to maintain the abnormal network address, so that the abnormal network address can be repaired in time.
In one or more embodiments of the present description, the method may further include:
and under the condition that the abnormal network address is detected, displaying the abnormal network address and the abnormal business object to which the abnormal network address belongs.
In one or more embodiments of the present description, the method may further include: extracting the jump relation between network addresses existing in the behavior data; and obtaining the associated data among the network addresses according to the jumping relation.
On this basis, the detecting the availability of the network address may further include:
and according to the associated data, carrying out availability detection on the network address.
In this embodiment, the association data may be service association data corresponding to each network address in the foregoing embodiments. The associated data may be data of a tree structure as shown in fig. 6.
In one or more embodiments of the present description, the detecting availability of the network address based on the association data may include:
determining the detection frequency of each network address according to the associated data; and according to the detection frequency, carrying out availability detection on the corresponding network address.
In the tree structure of the associated data, a network address at an upper layer has a higher level than a network address at a lower layer, and correspondingly, the detection frequency of an upper-layer network address may be higher than that of a lower-layer network address. In this way, the availability detection frequency of upper-level network addresses is increased, which helps ensure that jumps to the corresponding lower-level network addresses remain possible.
Specifically, the detection frequency corresponding to each level may be preset, and the detection frequency corresponding to each network address may be obtained according to the level corresponding to each network address in the associated data.
For example, in the tree structure shown in fig. 6, the network address URL1 is ranked highest, followed by network addresses URL1.1 and URL1.2, followed by network addresses URL1.1.1, URL1.1.3 and URL1.2.1, and finally by network address URL1.1.3.1. Then, it can be determined that the detection frequency of the network address URL1 is a first frequency, the detection frequencies of the network addresses URL1.1 and URL1.2 are a second frequency, the detection frequencies of the network addresses URL1.1.1, URL1.1.3 and URL1.2.1 are a third frequency, and the detection frequency of the network address URL1.1.3.1 is a fourth frequency, wherein the first frequency ≧ the second frequency ≧ the third frequency ≧ the fourth frequency.
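Assigning a preset detection frequency per level can be sketched as a level-by-level walk over the tree. The function name, the `TREE` mapping, and the frequencies (expressed here as checks per hour) are hypothetical; the description only requires that the first frequency is not lower than the second, and so on.

```python
# Hypothetical service association data of fig. 6 as a tree.
TREE = {
    "URL1": ["URL1.1", "URL1.2"],
    "URL1.1": ["URL1.1.1", "URL1.1.3"],
    "URL1.2": ["URL1.2.1"],
    "URL1.1.3": ["URL1.1.3.1"],
}


def detection_frequency_by_level(tree, root, per_level):
    """Assign each address the preset frequency of its level in the tree."""
    freq, level, frontier = {}, 0, [root]
    while frontier:
        # levels deeper than the preset table reuse the last (lowest) frequency
        level_freq = per_level[min(level, len(per_level) - 1)]
        next_frontier = []
        for addr in frontier:
            if addr in freq:
                continue
            freq[addr] = level_freq
            next_frontier.extend(tree.get(addr, []))
        frontier = next_frontier
        level += 1
    return freq


# Invented preset frequencies, highest level first (checks per hour).
freq = detection_frequency_by_level(TREE, "URL1", [60, 12, 4, 1])
print(freq)
```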
In one or more embodiments of the present description, the method may further include:
according to the behavior data, the jump probability of each network address is determined; determining the detection frequency of the corresponding network address according to the skipping probability; and according to the detection frequency, carrying out availability detection on the corresponding network address.
Specifically, the access times of each network address may be obtained according to the behavior data. According to the access times of each network address, the jump probability of each network address can be obtained. The hop probability of each network address may be a ratio between the number of accesses of the corresponding network address and the number of accesses of all network addresses. The hop probability can embody the probability of the user accessing the corresponding network address, and therefore, the detection frequency of the corresponding network address can be set according to the hop probability.
For example, for m network addresses URL1 to URLm, where the number of accesses to network address URL1 is N1, the number of accesses to network address URL2 is N2, ..., and the number of accesses to network address URLm is Nm, the jump probability of the i-th network address URLi may be:
$$P_i = \frac{N_i}{\sum_{j=1}^{m} N_j}$$
In this embodiment, a comparison table reflecting the correspondence between the hop probability range and the detection frequency may be preset, so that the corresponding detection frequency is higher when the hop probability is higher. By looking up the lookup table, the detection frequency of each network address can be determined, and the availability detection is performed on each network address according to the detection frequency.
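The jump probability and the comparison table can be sketched together. The function names, the visit log, and the ranges and frequencies in `FREQUENCY_TABLE` are all hypothetical; the sketch assumes the behavior data has already been reduced to a list of accessed network addresses.

```python
from collections import Counter


def jump_probabilities(visits):
    """Accesses of each address divided by the accesses of all addresses."""
    counts = Counter(visits)
    total = sum(counts.values())
    return {addr: n / total for addr, n in counts.items()}


# Hypothetical comparison table, highest range first:
# (lower bound of the probability range, checks per hour).
FREQUENCY_TABLE = [(0.5, 60), (0.2, 12), (0.0, 1)]


def detection_frequency(probability, table=FREQUENCY_TABLE):
    """Return the frequency of the first range whose lower bound is met."""
    for lower_bound, frequency in table:
        if probability >= lower_bound:
            return frequency
    return table[-1][1]


# Invented visit log: URL1 accessed 6 times, URL2 3 times, URL3 once.
visits = ["URL1"] * 6 + ["URL2"] * 3 + ["URL3"]
probs = jump_probabilities(visits)
freqs = {addr: detection_frequency(p) for addr, p in probs.items()}
print(probs)
print(freqs)
```

Ordering the table from the highest range down keeps the lookup a single pass and guarantees that a higher jump probability never maps to a lower detection frequency.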
In one embodiment, the detection method can increase the availability detection frequency of frequently accessed network addresses, so as to ensure that the web pages users access most often remain available.
< example 1>
The following describes a process implemented by the network address detection method in this embodiment by using a specific example.
Step S702, behavior data of the user executing the web page operation is obtained.
Step S704, extracts the network addresses existing in the behavior data and the jump relationship between the network addresses.
Step S706, setting each target network address as a seed network address, where the target network address at least includes a preset initial address.
Step S708, according to the jump relation, all the lower level network addresses of the seed network address are obtained.
Step S710, determining a word frequency-inverse text frequency index value of each lower-level network address.
Step S712, determining the association degree between the corresponding lower-level network address and the seed network address according to the word frequency-inverse text frequency index value of each lower-level network address.
Step S714, according to the association degree between the seed network address and each lower level network address, obtaining the lower level network address belonging to the same service object as the seed network address, and forming service association data.
Step S716, the lower level network address belonging to the same service object as the seed network address is used as the target network address.
Step S718, classifying the network addresses according to the service association data corresponding to each network address.
Step S720, using the service object to which the known initial address belongs as the service object to which the corresponding classification belongs.
Step S722, when an abnormal network address is detected, determining the abnormal service object to which the abnormal network address belongs.
Step S724, alarming and prompting the abnormal business object, and displaying the abnormal network address and the abnormal business object.
< apparatus >
In this embodiment, a network address detection device 8000 is provided. As shown in fig. 8, the apparatus 8000 for detecting the network address includes a data obtaining module 8100, an address extracting module 8200, and an availability detecting module 8300. The data acquisition module 8100 is used for acquiring behavior data of a user for executing webpage operation; the address extraction module 8200 is used for extracting a network address contained in the behavior data; the availability detection module 8300 is used for performing availability detection on the network address.
In one or more embodiments of the present description, the apparatus 8000 may also include a classification module 8400, an object determination module 8500, and an anomaly determination module 8600, as shown in FIG. 9. The classification module 8400 is configured to classify the network addresses according to the association degrees between the network addresses; the object determining module 8500 is configured to determine, according to the network address included in each category, a service object to which the corresponding category belongs; the anomaly determination module 8600 is configured to obtain, according to the service object to which each category belongs, a service object to which an abnormal network address detected by the availability detection belongs.
In one or more embodiments of the present description, the object determination module 8500 may further be configured to:
searching a preset initial address contained in each classification; and
and taking the service object to which the known initial address belongs as the service object to which the corresponding classification belongs.
In one or more embodiments of the present specification, the apparatus 8000 may further include a jump relation extracting module 8700 as shown in fig. 10, configured to extract a jump relation between network addresses existing in the behavior data, where the jump relation indicates a network address jumping to another web page via a network address of a web page. Wherein, the classification module 8400 may also be configured to:
and classifying the network addresses according to the association degree of the network addresses with the jump relation.
In one or more embodiments of the present specification, the classification module 8400 may further include a seed address setting unit 8410, a lower address obtaining unit 8420, an associated data obtaining unit 8430, a target address setting unit 8440, and a classification unit 8450 as illustrated in fig. 10. The seed address setting unit 8410 is configured to set each target network address as a seed network address, where the target network address at least includes an initial address; the lower address obtaining unit 8420 is configured to obtain all lower-level network addresses of the seed network address according to the jump relationship; the associated data obtaining unit 8430 is configured to obtain, according to the association degree between the seed network address and each lower-level network address, a lower-level network address that belongs to the same service object as the seed network address, and form service association data; the target address setting unit 8440 is configured to use a lower-level network address that belongs to the same service object as the seed network address as the target network address; the classification unit 8450 is configured to classify the network addresses according to the service association data corresponding to each network address.
In one or more embodiments of the present description, the association data obtaining unit 8430 may further be configured to:
determining a word frequency-inverse text frequency index value of each lower-level network address;
and determining the association degree between the corresponding lower-level network address and the seed network address according to the word frequency-inverse text frequency index value.
In one or more embodiments of the present description, determining the word frequency-inverse text frequency index value for each of the subordinate network addresses comprises:
taking each lower-level network address as a current network address in turn;
determining the number of times the current network address appears in the seed network address as a first number of times;
determining the number of times all lower-level network addresses appear in the seed network address as a second number of times;
determining the word frequency of the current network address according to the first number of times and the second number of times;
determining the number of times the current network address appears in all seed network addresses as a third number of times;
determining the total number of all seed network addresses;
determining the inverse text frequency index of the current network address according to the total number of all seed network addresses and the third number of times;
and obtaining the word frequency-inverse text frequency index value of the current network address according to the word frequency and the inverse text frequency index.
In one or more embodiments of the present description, the apparatus 8000 may also include:
a module for determining, in the case that an abnormal network address is detected, the abnormal service object to which the abnormal network address belongs; and
a module for alarming and reminding the abnormal service object.
In one or more embodiments of the present description, the apparatus 8000 further includes:
a module for extracting jump relationships between network addresses present in the behavior data, wherein a jump relationship represents a network address jumping to a web page via a network address of another web page, and
a module for obtaining the associated data between the network addresses according to the jump relation;
wherein the availability detection module 8300 may be further configured to:
perform availability detection on the network address according to the associated data.
In one or more embodiments of the present description, detecting availability of a network address based on the association data comprises:
determining the detection frequency of each network address according to the associated data; and
and according to the detection frequency, carrying out availability detection on the corresponding network address.
In one or more embodiments of the present description, the apparatus 8000 may further include:
means for determining a hop probability for each network address based on the behavior data;
a module for determining the detection frequency of the corresponding network address according to the jump probability; and
and the module is used for carrying out availability detection on the corresponding network address according to the detection frequency.
In one or more embodiments of the present description, the apparatus 8000 may further include:
and the module is used for displaying the abnormal network address under the condition that the abnormal network address is detected.
It will be appreciated by those skilled in the art that the network address detection device 8000 can be implemented in a variety of ways. For example, the network address detection device 8000 may be implemented by configuring a processor with instructions. For example, the instructions may be stored in a ROM and read from the ROM into a programmable device when the device is started, so as to implement the network address detection device 8000. For example, the network address detection device 8000 may be built into a dedicated device (e.g., a processor). The network address detection device 8000 may be divided into mutually independent units, or the units may be integrated together. The network address detection device 8000 may be implemented in one of the ways described above, or in a combination of two or more of them.
In this embodiment, the network address detection device 8000 may have various implementations, for example, the network address detection device 8000 may be any functional module running in a software product or application providing a network address detection function, or a peripheral insert, a plug-in, a patch, etc. of the software product or application, or the software product or application itself.
< electronic apparatus >
In this embodiment, an electronic device 9000 is also provided. The electronic device 9000 can comprise a server 1100 as shown in fig. 1. The electronic device 9000 may also be a terminal device 1200 as shown in fig. 2.
In one aspect, the electronic device 9000 may comprise the aforementioned means for detecting a network address 8000 for implementing the methods of any of the embodiments herein.
In another aspect, as shown in fig. 11, an electronic device 9000 can further comprise a processor 9100 and a memory 9200, the memory 9200 to store executable instructions; the processor 9100 is configured to operate the electronic device 9000 to perform a method of detecting a network address according to any embodiment of the present specification, according to control of instructions.
The embodiments in the present specification are described in a progressive manner; for the same or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, the apparatus embodiment and the electronic device embodiment are described relatively briefly because they are substantially similar to the method embodiment, and the relevant points may be found in the description of the method embodiment.
The present description may be a method and/or a computer program product. The computer program product may include a computer-readable storage medium having computer-readable program instructions embodied therewith for causing a processor to implement various aspects of the present description.
The computer readable storage medium may be a tangible device that can hold and store the instructions for use by the instruction execution device. The computer readable storage medium may be, for example, but not limited to, an electronic memory device, a magnetic memory device, an optical memory device, an electromagnetic memory device, a semiconductor memory device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a Static Random Access Memory (SRAM), a portable compact disc read-only memory (CD-ROM), a Digital Versatile Disc (DVD), a memory stick, a floppy disk, a mechanical coding device, such as punch cards or in-groove projection structures having instructions stored thereon, and any suitable combination of the foregoing. Computer-readable storage media as used herein is not to be interpreted as a transitory signal per se, such as a radio wave or other freely propagating electromagnetic wave, an electromagnetic wave propagating through a waveguide or other transmission medium (e.g., optical pulses through a fiber optic cable), or an electrical signal transmitted through an electrical wire.
The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to a respective computing/processing device, or to an external computer or external storage device over a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. The network adapter card or network interface in each computing/processing device receives the computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in the respective computing/processing device.
Computer program instructions for carrying out operations of the present specification may be assembler instructions, Instruction Set Architecture (ISA) instructions, machine-related instructions, microcode, firmware instructions, state setting data, or source or object code written in any combination of one or more programming languages, including an object-oriented programming language such as Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, aspects of the present description are implemented by personalizing an electronic circuit, such as a programmable logic circuit, a Field Programmable Gate Array (FPGA), or a Programmable Logic Array (PLA), with state information of computer-readable program instructions, the electronic circuit being operable to execute the computer-readable program instructions.
Aspects of the present description are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the description. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable medium storing the instructions comprises an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present description. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions. It is well known to those skilled in the art that implementation by hardware, by software, and by a combination of software and hardware are equivalent.
The foregoing description of the embodiments of the present specification has been presented for purposes of illustration and description, but is not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein is chosen in order to best explain the principles of the embodiments, the practical application, or improvements made to the technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein. The scope of the present description is defined by the appended claims.
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or order of connection, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.

Claims (19)

1. A method for detecting a network address, comprising:
acquiring behavior data of a user for executing webpage operation;
extracting a network address contained in the behavior data; the network address at least comprises a network address of a webpage accessed by a user and a network address of a previous webpage;
performing availability detection on the network address;
the method further comprises the following steps:
extracting a jump relation between network addresses existing in the behavior data, wherein the jump relation represents a jump from a network address of one web page to a network address of another web page;
classifying the network addresses according to the association degree between the network addresses with the jump relation;
determining a business object to which the corresponding classification belongs according to the network address contained in each classification; and
obtaining, according to the business object to which each classification belongs, the business object to which the abnormal network address detected by the availability detection belongs;
the classifying the network address comprises:
setting each target network address as a seed network address, wherein the target network addresses at least comprise initial addresses;
acquiring all lower-level network addresses of the seed network address according to the jump relation;
obtaining, according to the association degree between the seed network address and each lower-level network address, the lower-level network addresses that belong to the same business object as the seed network address, to form business association data;
taking the lower-level network addresses that belong to the same business object as the seed network address as target network addresses;
and classifying the network addresses according to the business association data corresponding to each network address.
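As a non-limiting illustration, the seed-based classification recited in claim 1 might be sketched as a breadth-first expansion from each initial address. The data layout (`jumps` as a mapping from an address to its lower-level addresses), the `association` callback, and the `threshold` value are all assumptions made for this sketch and are not specified by the claims.

```python
from collections import deque


def classify_addresses(initial_addresses, jumps, association, threshold=0.5):
    """Group addresses into one classification per initial address.

    jumps:        dict mapping an address to its lower-level addresses
                  (hypothetical representation of the jump relation).
    association:  callable(seed, lower) -> float, the association degree.
    threshold:    hypothetical cut-off for "same business object".
    """
    classes = {}
    for initial in initial_addresses:
        members = {initial}
        queue = deque([initial])  # target addresses waiting to act as seeds
        while queue:
            seed = queue.popleft()
            for lower in jumps.get(seed, []):
                if lower in members:
                    continue
                if association(seed, lower) >= threshold:
                    members.add(lower)
                    queue.append(lower)  # becomes a target (future seed)
        classes[initial] = members
    return classes
```

Each accepted lower-level address is queued so that it serves as a seed in a later iteration, which mirrors the step of taking such addresses as target network addresses.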
2. The method of claim 1, the determining the business object to which the corresponding classification belongs comprising:
searching a preset initial address contained in each classification; and
taking the known business object to which the initial address belongs as the business object to which the corresponding classification belongs.
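By way of example only, the lookup of claim 2, combined with the final step of claim 1 that maps an abnormal address to its business object, could look like the following sketch. The function names and the `known_objects` mapping (preset initial address to business object) are hypothetical.

```python
def business_object_of(classification, known_objects):
    """Return the business object of a preset initial address in the class."""
    for address in classification:
        if address in known_objects:
            return known_objects[address]
    return None  # no preset initial address in this classification


def owner_of_abnormal(address, classifications, known_objects):
    """Map an abnormal address to the business object of its classification."""
    for classification in classifications:
        if address in classification:
            return business_object_of(classification, known_objects)
    return None
```

For instance, if `https://pay.example/checkout` is classified together with the preset initial address `https://pay.example/home`, an availability failure on the checkout page is attributed to the business object registered for the home page.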
3. The method of claim 1, classifying the network address further comprising:
determining a term frequency-inverse document frequency (TF-IDF) value of each lower-level network address;
and determining the association degree between the corresponding lower-level network address and the seed network address according to the TF-IDF value.
4. The method of claim 3, determining a TF-IDF value for each of said lower-level network addresses comprising:
taking each lower-level network address as a current network address in turn;
determining the number of times the current network address appears in the seed network address as a first count;
determining the number of occurrences of all the lower-level network addresses in the seed network address as a second count;
determining the term frequency of the current network address according to the first count and the second count;
determining the number of occurrences of the current network address in all seed network addresses as a third count;
determining the total number of all seed network addresses;
determining the inverse document frequency of the current network address according to the total number of all the seed network addresses and the third count;
and obtaining the TF-IDF value of the current network address according to the term frequency and the inverse document frequency.
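As a hedged illustration of claim 4, the word frequency-inverse text frequency index value (commonly called TF-IDF) can be computed from the three counts and the total number of seeds. The claim does not fix the exact formulas; this sketch assumes the conventional forms, namely a ratio for the word frequency and the logarithm of a ratio for the inverse text frequency index, and it assumes the third count is the number of seeds under which the current address appears. The `jump_counts` layout is likewise hypothetical.

```python
import math


def tf_idf(seed, current, jump_counts):
    """TF-IDF of `current` as a lower-level address of `seed`.

    jump_counts: dict seed -> dict lower-level address -> occurrence count
                 (hypothetical layout derived from the jump relation).
    """
    first = jump_counts[seed].get(current, 0)      # occurrences under this seed
    second = sum(jump_counts[seed].values())       # all occurrences under this seed
    tf = first / second if second else 0.0

    # Number of seeds under which the current address appears (assumed meaning).
    third = sum(1 for lowers in jump_counts.values() if current in lowers)
    total = len(jump_counts)                       # total number of seeds
    idf = math.log(total / third) if third else 0.0

    return tf * idf
```

Under these assumptions, an address that appears under every seed (for example a shared login page) receives an inverse text frequency index of zero and therefore a low association degree with any single seed.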
5. The method of claim 1, further comprising:
under the condition that an abnormal network address is detected, determining an abnormal business object to which the abnormal network address belongs; and
issuing an alarm to notify the abnormal business object.
6. The method of claim 1, further comprising:
extracting a jump relationship between network addresses present in the behavior data, wherein the jump relationship represents a jump to a network address of one web page via a network address of another web page; and
obtaining associated data among the network addresses according to the jump relationship;
wherein the detecting of the availability of the network address comprises:
carrying out availability detection on the network address according to the associated data.
7. The method of claim 6, wherein detecting the availability of the network address based on the association data comprises:
determining the detection frequency of each network address according to the associated data; and
carrying out availability detection on the corresponding network address according to the detection frequency.
8. The method of claim 1, further comprising:
determining the jump probability of each network address according to the behavior data;
determining the detection frequency of the corresponding network address according to the jump probability; and
carrying out availability detection on the corresponding network address according to the detection frequency.
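Purely as an example of claim 8, the jump probability might be estimated as the share of observed jumps that land on each address, and the detection frequency expressed as an interval between checks that shrinks as the probability grows. Both the estimator and the linear mapping, including the `min_seconds` and `max_seconds` bounds, are assumptions of this sketch rather than features stated in the claims.

```python
from collections import Counter


def jump_probabilities(jumps):
    """jumps: iterable of (previous_address, accessed_address) pairs."""
    targets = Counter(dst for _, dst in jumps)
    total = sum(targets.values())
    return {addr: count / total for addr, count in targets.items()}


def detection_interval(probability, min_seconds=60, max_seconds=3600):
    """Seconds between checks; frequently reached addresses are checked more often."""
    return max_seconds - probability * (max_seconds - min_seconds)
```

With these hypothetical bounds, an address that receives half of all observed jumps would be checked roughly every 1830 seconds, while an address that is never reached would be checked once per hour.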
9. The method of claim 1, further comprising:
displaying the abnormal network address when an abnormal network address is detected.
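As a minimal sketch of the availability detection of claim 1 and the collection of abnormal addresses for display in claim 9, one possible check is an HTTP request whose failure or error status marks the address as abnormal. The claims do not prescribe how availability is tested; the status-code criterion, the timeout, and the injectable `opener` parameter are assumptions of this illustration.

```python
import urllib.error
import urllib.request


def is_available(url, timeout=5.0, opener=urllib.request.urlopen):
    """Return True if the address answers with a 2xx or 3xx status."""
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        # Connection failures, HTTP error statuses and malformed URLs
        # are all treated as "not available" in this sketch.
        return False


def abnormal_addresses(urls, check=is_available):
    """Collect the addresses that fail the availability check, for display."""
    return [url for url in urls if not check(url)]
```

The `opener` and `check` parameters exist only so that the sketch can be exercised without network access; a deployment would normally rely on the defaults.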
10. An apparatus for detecting a network address, comprising:
a data acquisition module, configured to acquire behavior data of web page operations executed by a user;
an address extraction module, configured to extract a network address contained in the behavior data; the network address at least comprises a network address of a web page accessed by the user and a network address of a previous web page;
an availability detection module, configured to perform availability detection on the network address;
a jump relation extracting module, configured to extract a jump relation between network addresses existing in the behavior data, where the jump relation represents a jump from a network address of one web page to a network address of another web page;
a classification module, configured to classify the network addresses according to the association degree between the network addresses having the jump relation;
an object determining module, configured to determine the business object to which the corresponding classification belongs according to the network address contained in each classification; and
an anomaly determination module, configured to obtain, according to the business object to which each of the classifications belongs, the business object to which an abnormal network address detected by the availability detection belongs;
the classifying the network address comprises:
setting each target network address as a seed network address, wherein the target network address at least comprises an initial address;
acquiring all lower-level network addresses of the seed network address according to the jump relation;
obtaining, according to the association degree between the seed network address and each lower-level network address, the lower-level network addresses that belong to the same business object as the seed network address, to form business association data;
taking the lower-level network addresses that belong to the same business object as the seed network address as target network addresses;
and classifying the network addresses according to the business association data corresponding to each network address.
11. The apparatus of claim 10, the object determination module being further configured to:
search for a preset initial address contained in each classification; and
take the known business object to which the initial address belongs as the business object to which the corresponding classification belongs.
12. The apparatus of claim 10, the classification module comprising: an association data obtaining unit, the association data obtaining unit further configured to:
determine a term frequency-inverse document frequency (TF-IDF) value of each lower-level network address;
and determine the association degree between the corresponding lower-level network address and the seed network address according to the TF-IDF value.
13. The apparatus of claim 12, wherein determining a TF-IDF value for each of said lower-level network addresses comprises:
taking each lower-level network address as a current network address in turn;
determining the number of times the current network address appears in the seed network address as a first count;
determining the number of occurrences of all the lower-level network addresses in the seed network address as a second count;
determining the term frequency of the current network address according to the first count and the second count;
determining the number of occurrences of the current network address in all seed network addresses as a third count;
determining the total number of all seed network addresses;
determining the inverse document frequency of the current network address according to the total number of all the seed network addresses and the third count;
and obtaining the TF-IDF value of the current network address according to the term frequency and the inverse document frequency.
14. The apparatus of claim 10, further comprising:
a module for determining, when an abnormal network address is detected, an abnormal business object to which the abnormal network address belongs; and
a module for issuing an alarm to notify the abnormal business object.
15. The apparatus of claim 10, further comprising:
a module for extracting a jump relationship between network addresses present in the behavior data, wherein the jump relationship represents a jump to a network address of one web page via a network address of another web page; and
a module for obtaining associated data among the network addresses according to the jump relationship;
wherein the availability detection module comprises:
a module for carrying out availability detection on the network address according to the associated data.
16. The apparatus of claim 15, the detecting availability of the network address according to the association data comprising:
determining the detection frequency of each network address according to the associated data; and
carrying out availability detection on the corresponding network address according to the detection frequency.
17. The apparatus of claim 10, further comprising:
a module for determining a jump probability for each network address based on the behavior data;
a module for determining the detection frequency of the corresponding network address according to the jump probability; and
a module for carrying out availability detection on the corresponding network address according to the detection frequency.
18. The apparatus of claim 10, further comprising:
means for presenting an abnormal network address if the abnormal network address is detected.
19. An electronic device, comprising:
the apparatus of any one of claims 10 to 18; or,
a processor and a memory for storing executable instructions for controlling the processor to perform the method of any one of claims 1 to 9.
CN201910780723.0A 2019-08-22 2019-08-22 Network address detection method and device and electronic equipment Active CN110740074B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910780723.0A CN110740074B (en) 2019-08-22 2019-08-22 Network address detection method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910780723.0A CN110740074B (en) 2019-08-22 2019-08-22 Network address detection method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN110740074A CN110740074A (en) 2020-01-31
CN110740074B true CN110740074B (en) 2023-04-18

Family

ID=69267764

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910780723.0A Active CN110740074B (en) 2019-08-22 2019-08-22 Network address detection method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN110740074B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106326485A (en) * 2016-09-05 2017-01-11 郑州悉知信息科技股份有限公司 Method for detecting web link and device thereof
CN108304410A (en) * 2017-01-13 2018-07-20 阿里巴巴集团控股有限公司 A kind of detection method, device and the data analysing method of the abnormal access page

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104462183B (en) * 2014-10-10 2019-03-26 北京国双科技有限公司 Webpage jumps processing method and processing device
CN104766014B (en) * 2015-04-30 2017-12-01 安一恒通(北京)科技有限公司 For detecting the method and system of malice network address
CN107154959A (en) * 2016-03-02 2017-09-12 阿里巴巴集团控股有限公司 A kind of method and apparatus of the access network address
CN106874474A (en) * 2017-02-16 2017-06-20 维沃移动通信有限公司 A kind of invalid web pages processing method of web page storage, server and terminal
CN108469979A (en) * 2018-03-28 2018-08-31 深圳前海桔子信息技术有限公司 A kind of method for page jump, device, server and storage medium

Also Published As

Publication number Publication date
CN110740074A (en) 2020-01-31

Similar Documents

Publication Publication Date Title
KR102613774B1 (en) Systems and methods for extracting and sharing application-related user data
US20160241589A1 (en) Method and apparatus for identifying malicious website
CN104899220B (en) Application program recommendation method and system
WO2019061443A1 (en) Notification display method and terminal
US20180248879A1 (en) Method and apparatus for setting access privilege, server and storage medium
US20130290347A1 (en) Systems and methods for providing data-driven document suggestions
CN107872534B (en) Information pushing method and device, server and readable storage medium
CN107609122B (en) Advertisement shielding rule updating method, device, server and storage medium
CN111314063A (en) Big data information management method, system and device based on Internet of things
US11616860B2 (en) Information display method, terminal, and server
US20160004703A1 (en) Methods for modifying and ranking searches with actions based on prior search results and actions
CN110083677B (en) Contact person searching method, device, equipment and storage medium
CN103678706A (en) Picture recognition method, system, equipment and device based on screenshot information
CN106549860B (en) Information acquisition method and device
CN110740074B (en) Network address detection method and device and electronic equipment
US10775966B2 (en) Customizable autocomplete option
CN114265777B (en) Application program testing method and device, electronic equipment and storage medium
CN106559554A (en) A kind of communication processing method, device
CN111385375B (en) Method and equipment for generating email address
CN109656659B (en) Behavior event processing method and device, electronic equipment and readable storage medium
CN108415957B (en) Method and device for self-defined navigation of webpage
JP2012194783A (en) Server to be used in application market, communication terminal, system and gui determination method
US10176248B2 (en) Performing a dynamic search of electronically stored records based on a search term format
CN111428491B (en) Merging method and device of character streams and electronic equipment
KR102072391B1 (en) Object recongnition method and system using touch screen

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right
TA01 Transfer of patent application right

Effective date of registration: 20200921

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant after: Innovative advanced technology Co.,Ltd.

Address before: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant before: Advanced innovation technology Co.,Ltd.

Effective date of registration: 20200921

Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands

Applicant after: Advanced innovation technology Co.,Ltd.

Address before: A four-storey 847 mailbox in Grand Cayman Capital Building, British Cayman Islands

Applicant before: Alibaba Group Holding Ltd.

GR01 Patent grant