CN109743309B - Illegal request identification method and device and electronic equipment - Google Patents



Publication number
CN109743309B
Authority
CN
China
Prior art keywords
illegal
url
similarity
target
urls
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811624702.1A
Other languages
Chinese (zh)
Other versions
CN109743309A (en
Inventor
王嘉伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Weimeng Chuangke Network Technology China Co Ltd
Original Assignee
Weimeng Chuangke Network Technology China Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Weimeng Chuangke Network Technology China Co Ltd filed Critical Weimeng Chuangke Network Technology China Co Ltd
Priority to CN201811624702.1A priority Critical patent/CN109743309B/en
Publication of CN109743309A publication Critical patent/CN109743309A/en
Application granted granted Critical
Publication of CN109743309B publication Critical patent/CN109743309B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical



Abstract

The application discloses an illegal request identification method which is used for solving the problem that in the prior art, the accuracy rate of illegal request identification is low. The method comprises the following steps: after a data access request is received, determining a target Uniform Resource Locator (URL) of the request; determining the similarity between the target URL and a predetermined target illegal URL, wherein the target illegal URL is a part or all of illegal URLs, and the illegal URLs are URLs which are requested to be accessed by illegal users; and if the similarity meets a first preset condition, determining that the data access request is an illegal request. The application also discloses an illegal request recognition device, electronic equipment and a computer readable storage medium.

Description

Illegal request identification method and device and electronic equipment
Technical Field
The present application relates to the field of computer technologies, and in particular, to an illegal request identification method and apparatus, an electronic device, and a computer-readable storage medium.
Background
In computer technology, internet websites provide users with rich information and greatly promote information dissemination. However, for various reasons, some illegal users use machines to simulate normal users and send web page access requests to a website. Such requests are often numerous and frequent, which can negatively affect the website's servers, so illegal requests need to be identified and blocked.
In the prior art, when an illegal request is identified, the number of times of access of each Internet Protocol (IP) address in a past period of time is counted, and for an IP with an excessively high number of times of access, a user using the IP is considered as an illegal user, and the IP is subjected to a blocking process to prevent the access request sent under the IP.
However, in the prior art, if an illegal user frequently changes an IP address and sends an access request through a different IP address, the illegal request cannot be accurately identified.
Disclosure of Invention
The embodiment of the application provides an illegal request identification method, which is used for solving the problem of low accuracy of illegal request identification in the prior art.
The embodiment of the application also provides an illegal request identification device, which is used for solving the problem of low accuracy rate of illegal request identification in the prior art.
The embodiment of the application further provides electronic equipment, which is used for solving the problem that in the prior art, the accuracy rate of illegal request identification is low.
The embodiment of the application also provides a computer-readable storage medium, which is used for solving the problem of low accuracy of illegal request identification in the prior art.
The embodiment of the application adopts the following technical scheme:
an illegal request identification method, comprising:
after a data access request is received, determining a target Uniform Resource Locator (URL) of the request;
determining the similarity between the target URL and a predetermined target illegal URL, wherein the target illegal URL is a part or all of illegal URLs, and the illegal URLs are URLs which are requested to be accessed by illegal users;
and if the similarity meets a first preset condition, determining that the data access request is an illegal request.
An illegal request recognition device comprising:
the target URL determining unit is used for determining a target Uniform Resource Locator (URL) of a request after receiving a data access request;
the first similarity determining unit is used for determining the similarity between the target URL and a predetermined target illegal URL, wherein the target illegal URL is a part or all of illegal URLs, and the illegal URLs are URLs which are requested to be accessed by illegal users;
and the illegal request determining unit is used for determining that the data access request is an illegal request if the similarity meets a first preset condition.
An electronic device, comprising:
a processor; and
a memory arranged to store computer executable instructions that, when executed, cause the processor to:
after a data access request is received, determining a target Uniform Resource Locator (URL) of the request;
determining the similarity between the target URL and a predetermined target illegal URL, wherein the target illegal URL is a part or all of illegal URLs, and the illegal URLs are URLs which are requested to be accessed by illegal users;
and if the similarity meets a first preset condition, determining that the data access request is an illegal request.
A computer-readable storage medium storing one or more programs that, when executed by an electronic device including a plurality of application programs, cause the electronic device to:
after a data access request is received, determining a target Uniform Resource Locator (URL) of the request;
determining the similarity between the target URL and a predetermined target illegal URL, wherein the target illegal URL is a part or all of illegal URLs, and the illegal URLs are URLs which are requested to be accessed by illegal users;
and if the similarity meets a first preset condition, determining that the data access request is an illegal request.
The embodiment of the application adopts at least one technical scheme which can achieve the following beneficial effects:
after receiving the data access request, determining a target URL in the request, then determining the similarity between the target URL and a target illegal URL, and if the similarity is greater than a set similarity threshold, determining that the data access request is an illegal request. Because the target illegal URL is a URL requested to be accessed by the illegal user, if the similarity between the URL in the newly received request and the target illegal URL is greater than the set similarity threshold, the newly received request can be considered as an illegal request. Therefore, the access request is not identified according to the IP address, so that even if an illegal user changes the IP address, the identification of the illegal request by the scheme is not influenced, and the accuracy of identifying the illegal request is improved compared with the prior art.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
fig. 1 is a schematic flow chart illustrating an implementation of an illegal request identification method according to an embodiment of the present application;
fig. 2 is a schematic structural diagram of an illegal request identification device according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be described in detail and completely with reference to the following specific embodiments of the present application and the accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The technical solutions provided by the embodiments of the present application are described in detail below with reference to the accompanying drawings.
As described in the Background, an internet website is meant to provide information to normal users. When an illegal user uses a machine to simulate a normal user visiting the website, however, a large amount of the website's core data is scraped and the access volume on the website's core interfaces becomes very large. An anti-scraping system is therefore required to identify illegal users and block their illegal requests.
The anti-scraping system can analyze the user access log over a past period of time to identify illegal IPs in the IP dimension. Taking a period of 10 minutes as an example, access log data within the 10 minutes is collected using a queue data structure, with one queue per IP. The access count of each IP is then computed every 10 minutes; if the access count of an IP exceeds a certain threshold, that IP is blocked, and the queue of every IP is then emptied. This prevents illegal requests from illegal users.
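The per-IP counting described above can be sketched as follows. This is only an illustration of the prior-art approach: the class name IpCounter, the threshold value and the sample IP addresses are assumptions, and the caller is assumed to invoke flush() once per 10-minute period.

```python
from collections import defaultdict, deque


class IpCounter:
    """Prior-art style blocker: one queue of log entries per IP address."""

    def __init__(self, max_requests=1000):
        # max_requests is a hypothetical threshold chosen for illustration.
        self.max_requests = max_requests
        self.queues = defaultdict(deque)
        self.blocked = set()

    def record(self, ip, url):
        """Append one access-log entry to the queue of the requesting IP."""
        self.queues[ip].append(url)

    def flush(self):
        """Run once per period: block busy IPs, then empty every queue."""
        for ip, queue in self.queues.items():
            if len(queue) > self.max_requests:
                self.blocked.add(ip)
        self.queues.clear()
        return self.blocked


counter = IpCounter(max_requests=2)
for _ in range(3):
    counter.record("192.0.2.1", "/u")   # one busy IP
counter.record("192.0.2.2", "/u")       # one quiet IP
blocked = counter.flush()
```

In this sketch a script that spreads the same three requests over three different addresses never exceeds the threshold on any single queue, which is the weakness discussed next.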
However, because this approach judges only the number of accesses initiated by an IP within a past period of time, a lawbreaker who switches to other IP addresses can bypass the blocking of the anti-scraping system, so the prior art cannot accurately identify illegal requests.
Lawbreakers initiate requests from multiple IP addresses mainly in two ways:
First, the lawbreaker writes a site-scraping request script on his or her own computer and actively changes the computer's IP address after the script has issued a certain number of requests or run for a certain time.
Second, the lawbreaker deploys the scraping script on cloud server products; because the IP addresses of cloud servers are dynamic, the scraping is carried out through multiple IPs.
The inventor analyzes the prior art and finds that the common characteristics of the two cases are as follows: since the requests come from the same script, although the IP addresses are different, the URLs are substantially similar and are different from the requests of normal users. Based on the above, the embodiment of the application provides an illegal request identification method based on the similarity of the URL, so as to solve the problem of low accuracy of illegal request identification in the prior art.
The execution subject of the illegal request identification method provided by the embodiments of the application can be an anti-scraping system. For convenience of description, the method is described below taking the anti-scraping system as the execution subject. It is to be understood that this is merely an exemplary illustration and should not be construed as a limitation of the method.
The illegal request identification method in one or more embodiments of the present application is described in detail below. The implementation flow diagram of the method is shown in fig. 1, and comprises the following steps:
step 11, after receiving a data access request, determining a target Uniform Resource Locator (URL) of the request;
A Uniform Resource Locator (URL) represents the location and access method of a resource on the internet and serves as the address of that resource.
A URL may be composed of multiple parts. The format of a typical URL may be expressed as: "protocol type:[//server address[:port number]][/resource hierarchy UNIX file path]file name[?query][#fragment ID]", although not every part of this format is required. For example, in the URL "http://abc.com/user?u=1", "http" is the protocol type, "abc.com" is the domain name and can serve as the server address, "/user" is the interface, and "u=1" is the parameter.
When a user initiates a data access request to a server through a client, the data access request includes a URL, which is the address of the data to be accessed. After the data access request is received, the URL in the request can therefore be determined; to distinguish it from the other URLs discussed below, it is referred to as the target URL.
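As a small illustration of the URL parts named above, the sketch below splits the example URL with Python's standard urllib.parse module. The example URL is written with an explicit "?" before the parameter, and the function name split_url and the dictionary keys are assumptions made for this illustration.

```python
from urllib.parse import urlsplit


def split_url(url):
    """Return the parts of a URL that the description refers to."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,    # e.g. "http"
        "server": parts.netloc,      # domain name, optionally with a port
        "interface": parts.path,     # e.g. "/user"
        "parameters": parts.query,   # e.g. "u=1"
    }


target = split_url("http://abc.com/user?u=1")
print(target)
```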
Step 12, determining the similarity between the target URL and a predetermined target illegal URL, wherein the target illegal URL is a URL which an illegal user requests to access;
An illegal URL is a URL that an illegal user requests to access, and the target illegal URL may be a part or all of the illegal URLs, such as one or more of them. The target illegal URL may be a representative one among the illegal URLs, that is, a URL having the characteristics common to a plurality of illegal URLs. The specific method for determining the target illegal URL is described in detail later and is not repeated here.
For the illegal URL, it can be obtained in various ways, which is not limited in this embodiment. For example, by counting the number of times of access of each IP address in a past period of time, for an IP whose number of times of access is higher than a certain number threshold, it is determined that a user using the IP is an illegal user, and a URL corresponding to the IP is an illegal URL.
And step 13, if the similarity meets a first preset condition, determining that the data access request is an illegal request.
If there is only one target illegal URL in step 12, the first preset condition may be that the similarity is greater than a set first similarity threshold, which may be set empirically, for example to 90%. If there is more than one target illegal URL in step 12, the first preset condition may be, for example, that the similarity to any one target illegal URL is greater than a set similarity threshold, that the similarities to more than one target illegal URL are greater than a set similarity threshold, or that the average of the similarities to the target illegal URLs is greater than a set threshold.
If the similarity meets the first preset condition, the target URL may be considered to be similar to the illegal URL, and the data access request in step 11 may be determined to be an illegal request; and if the similarity does not meet the first preset condition, determining that the data access request is not an illegal request.
After determining that the data access request is an illegal request, access to the data access request may be denied.
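The variants of the first preset condition in step 13 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the mode labels "any", "several" and "average", and the sample URLs (written with an explicit "?" before the parameters) are assumptions, and the similarity is the Levenshtein-based measure described later in this specification.

```python
def levenshtein(a, b):
    """Edit distance between strings a and b (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # replace ca with cb
        prev = cur
    return prev[-1]


def similarity(a, b):
    """(Max(x, y) - Levenshtein) / Max(x, y), with x and y the string lengths."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else (longest - levenshtein(a, b)) / longest


def is_illegal(target_url, target_illegal_urls, threshold=0.9, mode="any"):
    """Apply one variant of the first preset condition."""
    sims = [similarity(target_url, u) for u in target_illegal_urls]
    if not sims:
        return False
    if mode == "any":        # similar to at least one target illegal URL
        return max(sims) > threshold
    if mode == "several":    # similar to more than one target illegal URL
        return sum(s > threshold for s in sims) > 1
    if mode == "average":    # average similarity above the threshold
        return sum(sims) / len(sims) > threshold
    raise ValueError("unknown mode: " + mode)


targets = ["abc.com/u?ntype=wifi&d=1001&u=gas",
           "abc.com/u?ntype=wifi&d=1001&u=gms"]
suspicious = "abc.com/u?ntype=wifi&d=1001&u=ads"
ordinary = "abc.com/u?id=dds&targetnum=22271&u=wdd"
```

With these sample values the suspicious URL satisfies all three variants, while the ordinary URL satisfies none of them.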
By the illegal request identification method in one or more embodiments of the application, after a data access request is received, a target URL in the request is determined, then the similarity between the target URL and a target illegal URL is determined, and if the similarity is larger than a set similarity threshold, the data access request is determined to be an illegal request. Because the target illegal URL is a URL requested to be accessed by the illegal user, if the similarity between the URL in the newly received request and the target illegal URL is greater than the set similarity threshold, the newly received request can be considered as an illegal request. Therefore, the access request is not identified according to the IP address, so that even if an illegal user changes the IP address, the identification of the illegal request by the scheme is not influenced, and the accuracy of identifying the illegal request is improved compared with the prior art.
The following describes in detail the determination process of a target illegal URL in one or more embodiments of the present specification:
step 21, obtaining an illegal URL;
for the illegal URL, it can be obtained in various ways, which is not limited in this embodiment. For example, the number of times of access of each IP address in a past period of time may be counted, and for an IP whose number of times of access is higher than a certain number threshold, it may be considered that a user using the IP is an illegal user, and a URL corresponding to the IP is an illegal URL.
Step 22, determining the similarity between illegal URLs;
Step 21 often yields a plurality of illegal URLs. To select an illegal URL that can represent them, that is, one having the characteristics common to a plurality of illegal URLs, the similarity between the illegal URLs can be determined. The specific method for determining the similarity is described in detail later and is not repeated here.
And step 23, taking the illegal URL with the similarity meeting a second preset condition with other illegal URLs as the target illegal URL.
The second preset condition includes at least one of:
the average value of all the similarity between the illegal URL and other illegal URLs is larger than a set average value threshold value;
the average of all the similarities between the illegal URL and the other illegal URLs ranks above a first preset position in the average ranking, where the average ranking orders the averages from largest to smallest.
Specifically, for each illegal URL, the similarity between the URL and each of other illegal URLs may be calculated, and then the average value of the similarities is calculated, that is, the average value of the similarities between each of the illegal URLs and each of the other illegal URLs may be obtained. If the average value corresponding to a certain illegal URL is larger than the set average value threshold value, the illegal URL can be used as the target illegal URL.
In addition, the averages corresponding to the illegal URLs may be sorted to obtain an average ranking; if an illegal URL ranks above the first preset position, it may be used as the target illegal URL. For example, the illegal URL whose average ranks highest may be the target illegal URL.
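The selection by average similarity can be sketched as follows. The function names, the parameters mean_threshold and top_n, and the sample URLs (written with an explicit "?" before the parameters) are assumptions; when both parameters are given, this sketch applies both, which is only one possible reading of "at least one of".

```python
def levenshtein(a, b):
    """Edit distance between strings a and b (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def similarity(a, b):
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else (longest - levenshtein(a, b)) / longest


def average_similarities(urls):
    """For each URL, the mean similarity to every other URL in the list."""
    averages = []
    for i, u in enumerate(urls):
        others = [similarity(u, v) for j, v in enumerate(urls) if j != i]
        averages.append(sum(others) / len(others))
    return averages


def pick_targets(urls, mean_threshold=None, top_n=None):
    """Select target illegal URLs by average similarity (second preset condition)."""
    if len(urls) < 2:
        return list(urls)   # too few URLs to compare: keep them all
    ranked = sorted(zip(average_similarities(urls), urls),
                    key=lambda pair: pair[0], reverse=True)
    if top_n is not None:            # rank above a preset position
        ranked = ranked[:top_n]
    if mean_threshold is not None:   # average above a set threshold
        ranked = [pair for pair in ranked if pair[0] > mean_threshold]
    return [url for _, url in ranked]


illegal_urls = ["abc.com/u?ntype=wifi&d=1001&u=gas",
                "abc.com/u?ntype=wifi&d=1001&u=gms",
                "abc.com/u?ntype=wifi&d=1001&u=gamk",
                "abc.com/u?ntype=wifi&d=1001&u=peas"]
best = pick_targets(illegal_urls, top_n=1)
```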
In one or more embodiments of the present disclosure, the step 23 may also be implemented as follows:
determining the target similarity meeting a third preset condition in the similarity among the illegal URLs; and taking the illegal URL corresponding to the target similarity as a target illegal URL. Because the target similarity is calculated by two illegal URLs, the illegal URL corresponding to the target similarity is the two illegal URLs with the target similarity calculated.
The third preset condition comprises at least one of the following conditions:
the similarity is greater than a set similarity threshold, which may be set according to manual experience, such as 90%;
the similarity ranks above a second preset position in the similarity ranking, where the similarity ranking orders the similarities from largest to smallest.
The similarities between the illegal URLs can be calculated pairwise to obtain a plurality of similarities, which are then sorted. If the second preset position is, for example, the 6th, the similarities ranked in the top 5 can be taken.
In addition, because the similarity is obtained by calculating two URLs, that is, the same similarity corresponds to two URLs, one of the two URLs can be optionally selected as a target illegal URL, and the two URLs can be both selected as the target illegal URL.
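The pairwise variant can be sketched as follows. The function name pick_by_pairs, its parameters and the sample URLs (written with an explicit "?" before the parameters) are assumptions, and this sketch keeps both URLs of every selected pair, although keeping only one of them is equally permitted by the description.

```python
from itertools import combinations


def levenshtein(a, b):
    """Edit distance between strings a and b (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def similarity(a, b):
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else (longest - levenshtein(a, b)) / longest


def pick_by_pairs(urls, threshold=None, top_n=None):
    """Select the URLs behind the highest pairwise similarities."""
    pairs = sorted(((similarity(a, b), a, b) for a, b in combinations(urls, 2)),
                   key=lambda item: item[0], reverse=True)
    if top_n is not None:        # rank above a preset position
        pairs = pairs[:top_n]
    if threshold is not None:    # similarity above a set threshold
        pairs = [item for item in pairs if item[0] > threshold]
    targets = []
    for _, a, b in pairs:
        for url in (a, b):       # each similarity corresponds to two URLs
            if url not in targets:
                targets.append(url)
    return targets


illegal_urls = ["abc.com/u?ntype=wifi&d=1001&u=gas",
                "abc.com/u?ntype=wifi&d=1001&u=gms",
                "abc.com/u?ntype=wifi&d=1001&u=gamk",
                "abc.com/u?ntype=wifi&d=1001&u=peas"]
closest = pick_by_pairs(illegal_urls, top_n=1)
```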
Then, the illegal URL whose similarity with other illegal URLs satisfies the second preset condition can be used as the target illegal URL.
Through the above steps 21-23, the target illegal URL is a URL with high similarity to the other illegal URLs rather than every illegal URL, so that after a data access request is subsequently received, whether it is an illegal request can be judged more efficiently and quickly. In addition, because the target illegal URL has high similarity to the other illegal URLs, it has the characteristics common to most illegal URLs, and the accuracy of illegal request identification is not compromised.
It should be noted that if the number of illegal URLs is small and the target illegal URL cannot be obtained by calculating the similarity, all the illegal URLs can be used as the target illegal URL.
In one or more embodiments of the present disclosure, in order to further improve the accuracy of identifying an illegal request, the illegal requests may be classified according to the dimensions of the access interfaces, and different access interfaces correspond to different target illegal URLs, so that after receiving a data access request, a target illegal URL corresponding to the data access request may be obtained according to the interface of the URL corresponding to the data access request to determine whether the data access request is illegal.
Before describing in detail how to identify an illegal URL according to the access interface dimension, the following describes in detail the process of determining a target illegal URL according to the access interface dimension.
After the illegal URL is obtained, the illegal URLs can be classified according to the dimensionality of the access interface, and then the target illegal URL in the illegal URLs belonging to the same category is determined. Specifically, during the determination, the similarity between illegal URLs belonging to the same category can be determined, and then the illegal URL of which the similarity with other illegal URLs in the illegal URLs belonging to the same category meets a second preset condition is used as a target illegal URL; or determining the target similarity meeting a third preset condition in the similarity among illegal URLs belonging to the same category, and then taking the illegal URL corresponding to the target similarity as the target illegal URL.
Through the above process, the direct corresponding relation between the access interface and the target illegal URL can be obtained. If a plurality of access interfaces exist, corresponding target illegal URLs can exist under each access interface, namely the mapping relation between each target illegal URL and the access interface is established.
Specifically, the URLs requested by an illegal IP may be extracted from that IP's access log to form a list L. The URLs in L are then classified by access interface to obtain a set S = {L1, L2, L3, …, Li, …, Ln}, where Li is the list of URLs belonging to a given interface. The logs in Li can be denoted Ai1, Ai2, Ai3, …. The similarities of the URLs in Li are calculated pairwise to find the target illegal URL Aix that satisfies the second preset condition, which establishes the correspondence between interface Li and Aix. Finally, the target illegal URL corresponding to each interface in the set S is obtained, that is, a mapping f: Li -> Aix is constructed.
Once the mapping is constructed, when a data access request is received, the target access interface corresponding to the target URL can be determined from the target URL itself; the target illegal URL corresponding to the target access interface is then determined from the pre-built mapping between target illegal URLs and access interfaces.
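The lookup through the mapping can be sketched as follows. The mapping contents, the names TARGETS_BY_INTERFACE, interface_of and targets_for, and the sample URLs (written with an explicit "?" before the parameters) are hypothetical; the interface is taken to be the path component of the URL.

```python
from urllib.parse import urlsplit

# Hypothetical pre-built mapping f: access interface -> target illegal URLs.
TARGETS_BY_INTERFACE = {
    "/u": ["abc.com/u?ntype=wifi&d=1001&u=gas"],
    "/ss": ["abc.com/ss?ntype=wifi&d=1001&u=gas"],
}


def interface_of(url):
    """Return the access interface (the path) of a URL, with or without a scheme."""
    if "://" not in url:
        url = "//" + url   # lets urlsplit treat "abc.com" as the server address
    return urlsplit(url).path


def targets_for(url, mapping=TARGETS_BY_INTERFACE):
    """Target illegal URLs registered for the interface that the URL addresses."""
    return mapping.get(interface_of(url), [])


found = targets_for("abc.com/u?id=dds&targetnum=22271&u=wdd")
```

Only the target illegal URLs returned for the request's own interface then need to be compared with the target URL.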
Because the target URL of a request is checked per interface, and illegal URLs under the same access interface are more similar to one another, the accuracy of illegal request identification is further improved.
In one or more embodiments of the present specification, the similarity between two URLs may be computed as the similarity between the two URL strings, which can be quantified by an edit distance. An edit distance is a quantitative measure of the difference between two character strings (e.g., of English letters): the minimum number of operations required to change one string into the other.
The Levenshtein distance is one kind of edit distance; it is the minimum number of editing operations required to change one character string into another. The allowed editing operations are: replacing one character with another, inserting a character, and deleting a character.
In one or more embodiments of the present description, the similarity of two character strings may be calculated by the following formula:
Similarity=(Max(x,y)-Levenshtein)/Max(x,y)
where x and y are the lengths of the source and target strings, Max(x, y) is the larger of the two lengths, and Levenshtein is the Levenshtein distance between the two strings.
The Levenshtein distance can be computed with the following recursive method:
[Figures BDA0001927728500000101 and BDA0001927728500000111: code listing of the recursive Levenshtein distance computation; the listing is an image in the source and is not reproduced here.]
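Since the original listing is not available in this text, the following is an independent sketch of a recursive Levenshtein computation together with the similarity formula above; it is not the code from the figures. The memoization through functools.lru_cache is an addition made here so that the recursion stays fast on URL-length strings.

```python
from functools import lru_cache


def levenshtein(source, target):
    """Recursive Levenshtein distance with memoized sub-results."""

    @lru_cache(maxsize=None)
    def dist(i, j):
        # Distance between the first i characters of source
        # and the first j characters of target.
        if i == 0:
            return j    # insert the j remaining characters
        if j == 0:
            return i    # delete the i remaining characters
        cost = 0 if source[i - 1] == target[j - 1] else 1
        return min(dist(i - 1, j) + 1,          # delete a character
                   dist(i, j - 1) + 1,          # insert a character
                   dist(i - 1, j - 1) + cost)   # replace (or keep) a character

    return dist(len(source), len(target))


def similarity(source, target):
    """Similarity = (Max(x, y) - Levenshtein) / Max(x, y)."""
    longest = max(len(source), len(target))
    if longest == 0:
        return 1.0   # two empty strings are treated as identical
    return (longest - levenshtein(source, target)) / longest


print(levenshtein("kitten", "sitting"))   # 3
print(similarity("gas", "gms"))           # one replacement over length 3
```

The recursion depth grows with the combined length of the two strings, so very long inputs would call for an iterative formulation instead.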
A specific implementation of the illegal request identification provided in this specification is described in detail below with reference to a specific application scenario; for technical details not covered here, refer to the related description above.
Step 31, obtaining an illegal URL;
Suppose a collection of illegal IP addresses has been obtained, and the URL list (L) requested by one of the IP addresses is as follows:
abc.com/u?ntype=wifi&d=1001&u=gas
abc.com/u?ntype=wifi&d=1001&u=gms
abc.com/u?ntype=wifi&d=1001&u=gamk
abc.com/u?ntype=wifi&d=1001&u=peas
abc.com/ss?ntype=wifi&d=1001&u=gas
abc.com/ss?ntype=wifi&d=1001&u=gap
These requests clearly come from the same script, yet many other IPs that initiate such requests go undetected.
Step 32, classifying the URLs in L by interface to form a set S, with the URLs in list L numbered in order as 1, 2, 3, 4, 5, 6;
the resulting S can be expressed as:
[Figures BDA0001927728500000112 and BDA0001927728500000121: the set S, grouping the numbered URLs by access interface; the images are not reproduced here.]
where "/u" and "/ss" are the access interfaces.
Step 33, calculating the similarity of URLs belonging to the same interface;
for L1, logs 1, 2, 3 and 4 are taken out and their similarities are calculated pairwise.
Step 34, determining that log 1 has the highest similarity to the remaining URLs, and establishing the mapping "/u" -> 1.
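Steps 31 to 34 can be retraced with the sketch below. The six URLs are those of list L, written with an explicit "?" between interface and parameters; the helper names and the choice of "highest average similarity" as the selection rule are assumptions made for this illustration.

```python
from collections import defaultdict
from urllib.parse import urlsplit

L = ["abc.com/u?ntype=wifi&d=1001&u=gas",    # 1
     "abc.com/u?ntype=wifi&d=1001&u=gms",    # 2
     "abc.com/u?ntype=wifi&d=1001&u=gamk",   # 3
     "abc.com/u?ntype=wifi&d=1001&u=peas",   # 4
     "abc.com/ss?ntype=wifi&d=1001&u=gas",   # 5
     "abc.com/ss?ntype=wifi&d=1001&u=gap"]   # 6


def interface_of(url):
    """Access interface (path) of a URL given without a protocol prefix."""
    return urlsplit("//" + url).path


def levenshtein(a, b):
    """Edit distance between strings a and b (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def similarity(a, b):
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else (longest - levenshtein(a, b)) / longest


# Step 32: group the URL numbers by access interface.
S = defaultdict(list)
for number, url in enumerate(L, start=1):
    S[interface_of(url)].append(number)


def representative(numbers):
    """Steps 33 and 34: the log with the highest average similarity to the rest."""
    def average(n):
        others = [m for m in numbers if m != n]
        return sum(similarity(L[n - 1], L[m - 1]) for m in others) / len(others)
    return max(numbers, key=average)


mapping = {interface: representative(numbers) for interface, numbers in S.items()}
print(dict(S))          # {'/u': [1, 2, 3, 4], '/ss': [5, 6]}
print(mapping["/u"])    # 1
```

Under this rule log 1 is selected for "/u" because its average similarity to logs 2, 3 and 4 is the highest of the four.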
This completes the map building process. The established mapping will then be used to identify illegitimate requests.
In the online blocking system, suppose normal users access the "/u" interface with requests of the following form:
abc.com/u?ntype=3g&m=dc9a&info=succ&d=1001&u=gas
abc.com/u?id=dds&targetnum=22271&u=wdd
Each URL is parsed to obtain the accessed interface "/u", the mapping table is queried to obtain "/u" -> 1, and the similarity between the URL and URL No. 1 is calculated. The resulting similarity is low, below the blocking threshold, so the mapping does not cause the request to be blocked.
If the lawbreaker's script sends the following request from another IP:
abc.com/u?ntype=wifi&d=1001&u=ads
the similarity to log 1 exceeds 90%, so the access is blocked.
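The online check on these three requests can be retraced with the sketch below. The URLs are written with an explicit "?" between interface and parameters, the 90% threshold is the example value used in this description, and the names MAPPING, interface_of and should_block are assumptions.

```python
from urllib.parse import urlsplit

# Mapping "/u" -> 1 from the example, stored with the URL text of log 1.
MAPPING = {"/u": ["abc.com/u?ntype=wifi&d=1001&u=gas"]}


def interface_of(url):
    """Access interface (path) of a URL given without a protocol prefix."""
    return urlsplit("//" + url).path


def levenshtein(a, b):
    """Edit distance between strings a and b (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def similarity(a, b):
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else (longest - levenshtein(a, b)) / longest


def should_block(url, threshold=0.9):
    """Block when the URL is close to a target illegal URL of its own interface."""
    targets = MAPPING.get(interface_of(url), [])
    return any(similarity(url, target) > threshold for target in targets)


normal_requests = ["abc.com/u?ntype=3g&m=dc9a&info=succ&d=1001&u=gas",
                   "abc.com/u?id=dds&targetnum=22271&u=wdd"]
script_request = "abc.com/u?ntype=wifi&d=1001&u=ads"

for request in normal_requests + [script_request]:
    print(request, should_block(request))
```

The script request differs from log 1 by two character edits over a length of 33, giving a similarity of 31/33, or about 94%, while both normal requests differ from log 1 in length alone by enough to stay under the threshold.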
It should be noted that, for the previously established mapping "/u" -> 1, in practical applications each interface is likely to be scraped by several different lawbreakers, so an interface may be mapped to multiple URLs, for example "/u" -> 1, 23, 882.
For the string similarity, the calculation method described above may be used, or, more simply, the calculation may be delegated to an external library.
For example, only the following code is needed for computing the Levenshtein distance for URL1 and URL2 in Python:
import Levenshtein
L=Levenshtein.distance(URL1,URL2)
This calculation is simple, convenient and fast.
In the above specific implementation process, the access request is not identified according to the IP address, but is identified according to the similarity of the URL, so that even if an illegal user changes the IP address, the identification of the illegal request by the present scheme is not affected, and the accuracy of identifying the illegal request is improved compared with the prior art.
Based on the illegal request identification method provided by the embodiment of the present application, the embodiment of the present application further provides a corresponding illegal request identification device, as shown in fig. 2, the device mainly includes the following functional units:
a target URL determination unit 41 that, upon receiving a data access request, determines a target uniform resource locator URL of the request;
a first similarity determining unit 42, configured to determine similarity between the target URL and a predetermined target illegal URL, where the target illegal URL is a part or all of illegal URLs, and the illegal URL is a URL requested to be accessed by an illegal user;
the illegal request determining unit 43 determines that the data access request is an illegal request if the similarity satisfies a first preset condition.
In the embodiments of the present application, there are many specific implementations of an illegal request identification apparatus, and in one implementation, the apparatus further includes:
the acquisition unit is used for acquiring an illegal URL (uniform resource locator), wherein the illegal URL is a URL which is requested to be accessed by an illegal user;
a second similarity determining unit that determines a similarity between the illegal URLs;
a first target illegal URL determining unit, which takes an illegal URL with similarity to other illegal URLs meeting a second preset condition as the target illegal URL;
the second preset condition includes at least one of the following:
the average of all similarities between the illegal URL and the other illegal URLs is greater than a set average threshold;
the average of all similarities between the illegal URL and the other illegal URLs ranks above a first preset position in the average ranking, where the average ranking orders the averages from largest to smallest.
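The two variants of the second preset condition can be sketched as follows. This is a minimal sketch under stated assumptions: `difflib.SequenceMatcher` stands in for the similarity measure, the illegal URLs are assumed to be distinct strings, and the names `mean_threshold` and `top_n` are hypothetical parameters corresponding to the set average threshold and the first preset position.

```python
from difflib import SequenceMatcher
from statistics import mean


def similarity(url_a: str, url_b: str) -> float:
    # Assumed similarity measure; the embodiment does not fix one here.
    return SequenceMatcher(None, url_a, url_b).ratio()


def mean_similarities(illegal_urls):
    """Map each illegal URL to the mean of its similarities to all the others."""
    result = {}
    for i, url in enumerate(illegal_urls):
        others = [similarity(url, other)
                  for j, other in enumerate(illegal_urls) if j != i]
        result[url] = mean(others) if others else 0.0
    return result


def select_by_mean_threshold(illegal_urls, mean_threshold):
    """Keep URLs whose mean similarity exceeds the set average threshold."""
    means = mean_similarities(illegal_urls)
    return [url for url in illegal_urls if means[url] > mean_threshold]


def select_by_mean_rank(illegal_urls, top_n):
    """Keep the top_n URLs when means are sorted from largest to smallest."""
    means = mean_similarities(illegal_urls)
    ranked = sorted(illegal_urls, key=lambda url: means[url], reverse=True)
    return ranked[:top_n]
```

Selecting by mean similarity favors URLs that are representative of many illegal requests, so an isolated illegal URL is unlikely to become a target illegal URL.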
In one embodiment, the apparatus further comprises:
an acquisition unit that acquires an illegal URL;
a second similarity determining unit that determines a similarity between the illegal URLs;
the target similarity determining unit is used for determining the target similarity meeting a third preset condition in the similarity among the illegal URLs;
a third target illegal URL determining unit, which takes the illegal URL corresponding to the target similarity as a target illegal URL;
the third preset condition includes at least one of the following:
the similarity is greater than a set similarity threshold;
the similarity ranks above a second preset position in the similarity ranking, where the similarity ranking orders the similarities from largest to smallest.
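The third preset condition works on individual pairwise similarities rather than on per-URL averages. The sketch below is illustrative only: it assumes `difflib.SequenceMatcher` as the similarity measure, and it assumes that "the illegal URL corresponding to the target similarity" means both URLs of the pair that produced that similarity. The parameter names are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(url_a: str, url_b: str) -> float:
    # Assumed similarity measure; the embodiment does not fix one here.
    return SequenceMatcher(None, url_a, url_b).ratio()


def pairwise_similarities(illegal_urls):
    """Return (similarity, url_a, url_b) for every unordered pair,
    sorted from the largest similarity to the smallest."""
    pairs = [(similarity(a, b), a, b) for a, b in combinations(illegal_urls, 2)]
    return sorted(pairs, key=lambda pair: pair[0], reverse=True)


def _urls_of(pairs):
    """Collect the URLs of the given pairs, without duplicates, in order."""
    selected = []
    for _, a, b in pairs:
        for url in (a, b):
            if url not in selected:
                selected.append(url)
    return selected


def select_by_pair_threshold(illegal_urls, similarity_threshold):
    """Target similarities are those above the set similarity threshold."""
    pairs = pairwise_similarities(illegal_urls)
    return _urls_of([p for p in pairs if p[0] > similarity_threshold])


def select_by_pair_rank(illegal_urls, top_n):
    """Target similarities are the top_n entries of the similarity ranking."""
    return _urls_of(pairwise_similarities(illegal_urls)[:top_n])
```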
In one embodiment, the apparatus further comprises:
the classification unit is used for classifying the illegal URL according to the dimensionality of the access interface;
then, the second similarity determination unit is specifically configured to:
determining the similarity between illegal URLs belonging to the same category;
then, the first target illegal URL determining unit is specifically configured to:
and taking the illegal URL with the similarity meeting a second preset condition with other illegal URLs in the illegal URLs belonging to the same category as a target illegal URL.
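Classification by access interface can be sketched as grouping the illegal URLs and then applying the second preset condition inside each group. In this sketch the access interface is assumed to be the path component of the URL, the similarity measure is assumed to be `difflib.SequenceMatcher`, and `mean_threshold` is a hypothetical parameter; none of these choices is prescribed by the embodiment.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from statistics import mean
from urllib.parse import urlsplit


def access_interface(url: str) -> str:
    """Assumption: the access interface is identified by the URL path."""
    return urlsplit(url).path


def classify(illegal_urls):
    """Group illegal URLs by the access interface they request."""
    groups = defaultdict(list)
    for url in illegal_urls:
        groups[access_interface(url)].append(url)
    return dict(groups)


def targets_per_interface(illegal_urls, mean_threshold):
    """Within each category, keep URLs whose mean similarity to the other
    URLs of the same category exceeds the threshold."""
    targets = {}
    for interface, urls in classify(illegal_urls).items():
        kept = []
        for i, url in enumerate(urls):
            others = [SequenceMatcher(None, url, other).ratio()
                      for j, other in enumerate(urls) if j != i]
            if others and mean(others) > mean_threshold:
                kept.append(url)
        targets[interface] = kept
    return targets
```

Comparing only URLs of the same category avoids computing similarities between requests to unrelated interfaces, which would mostly be low and uninformative.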
In one embodiment, the apparatus further comprises:
a target access interface determining unit which determines a target access interface corresponding to the target URL;
and the second target illegal URL determining unit determines a target illegal URL corresponding to the target access interface according to the set mapping relation between the target illegal URL and the access interface.
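At request time, the mapping between access interfaces and target illegal URLs narrows the comparison to the URLs recorded for the interface being requested. The following sketch assumes that the mapping is a dictionary keyed by URL path; the mapping contents, the function names, the `difflib.SequenceMatcher` similarity, and the threshold 0.9 are all hypothetical.

```python
from difflib import SequenceMatcher
from urllib.parse import urlsplit

# Hypothetical mapping: access interface -> target illegal URLs.
TARGETS_BY_INTERFACE = {
    "/api/post": ["/api/post?id=1&text=spam"],
}


def candidates_for(target_url: str, mapping):
    """Look up the target illegal URLs for the interface of target_url.
    Assumption: the interface is the path component of the URL."""
    return mapping.get(urlsplit(target_url).path, [])


def is_illegal(target_url: str, mapping, threshold: float = 0.9) -> bool:
    """Compare the target URL only with the candidates of its own interface."""
    return any(
        SequenceMatcher(None, target_url, bad).ratio() > threshold
        for bad in candidates_for(target_url, mapping)
    )
```

For example, `is_illegal("/api/post?id=7&text=spam", TARGETS_BY_INTERFACE)` compares the URL with the single entry stored for `/api/post`, while a request to an interface with no entry is not compared with anything.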
The illegal request identification device provided by the embodiment of the application can determine the target URL in the request after receiving the data access request, then determine the similarity between the target URL and the target illegal URL, and determine that the data access request is an illegal request if the similarity is greater than a set similarity threshold. Because the target illegal URL is a URL requested to be accessed by the illegal user, if the similarity between the URL in the newly received request and the target illegal URL is greater than the set similarity threshold, the newly received request can be considered as an illegal request. Therefore, the access request is not identified according to the IP address, so that even if an illegal user changes the IP address, the identification of the illegal request by the scheme is not influenced, and the accuracy of identifying the illegal request is improved compared with the prior art.
Fig. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present application. Referring to fig. 3, at the hardware level, the electronic device includes a processor and, optionally, an internal bus, a network interface, and a memory. The memory may include internal memory, such as random-access memory (RAM), and may further include non-volatile memory, such as at least one disk memory. Of course, the electronic device may also include hardware required for other services.
The processor, the network interface, and the memory may be connected to each other via an internal bus, which may be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one double-headed arrow is shown in FIG. 3, but this does not indicate only one bus or one type of bus.
The memory is used to store a program. Specifically, the program may include program code, and the program code includes computer operating instructions. The memory may include internal memory and non-volatile storage, and provides instructions and data to the processor.
The processor reads the corresponding computer program from the non-volatile storage into internal memory and then runs it, forming the illegal request recognition device at the logical level. The processor executes the program stored in the memory and is specifically configured to perform the following operations:
after a data access request is received, determining a target Uniform Resource Locator (URL) of the request;
determining the similarity between the target URL and a predetermined target illegal URL, wherein the target illegal URL is a part or all of illegal URLs, and the illegal URLs are URLs which are requested to be accessed by illegal users;
and if the similarity meets a first preset condition, determining that the data access request is an illegal request.
The method performed by the illegal request recognition device according to the embodiment shown in fig. 2 of the present application may be applied to, or implemented by, a processor. The processor may be an integrated circuit chip with signal processing capability. In implementation, the steps of the above method may be completed by integrated logic circuits of hardware in the processor or by instructions in the form of software. The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. Such a processor may implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the present application. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of the method disclosed in connection with the embodiments of the present application may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium well known in the art, such as random-access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or registers. The storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the method in combination with its hardware.
The electronic device may further execute the method executed by the illegal request recognition device in fig. 2, and implement the function of the illegal request recognition device in the embodiment shown in fig. 2, which is not described herein again in this embodiment of the present application.
An embodiment of the present application further provides a computer-readable storage medium storing one or more programs, where the one or more programs include instructions, which, when executed by an electronic device including a plurality of application programs, enable the electronic device to perform the method performed by the illegal request recognition apparatus in the embodiment shown in fig. 2, and are specifically configured to perform:
after a data access request is received, determining a target Uniform Resource Locator (URL) of the request;
determining the similarity between the target URL and a predetermined target illegal URL, wherein the target illegal URL is a part or all of illegal URLs, and the illegal URLs are URLs which are requested to be accessed by illegal users;
and if the similarity meets a first preset condition, determining that the data access request is an illegal request.
The electronic device and the computer-readable storage medium provided by the embodiment of the application determine a target URL in a data access request after receiving the request, then determine similarity between the target URL and a target illegal URL, and determine that the data access request is an illegal request if the similarity is greater than a set similarity threshold. Because the target illegal URL is a URL requested to be accessed by the illegal user, if the similarity between the URL in the newly received request and the target illegal URL is greater than the set similarity threshold, the newly received request can be considered as an illegal request. Therefore, the access request is not identified according to the IP address, so that even if an illegal user changes the IP address, the identification of the illegal request by the scheme is not influenced, and the accuracy of identifying the illegal request is improved compared with the prior art.
In the 1990s, an improvement to a technology could be clearly classified either as a hardware improvement (for example, an improvement to a circuit structure such as a diode, transistor, or switch) or as a software improvement (an improvement to a method flow). As technology has advanced, however, many improvements to method flows can now be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. It therefore cannot be said that an improvement to a method flow cannot be realized with a physical hardware module. For example, a Programmable Logic Device (PLD), such as a Field Programmable Gate Array (FPGA), is an integrated circuit whose logic functions are determined by the user programming the device. A designer programs a digital system onto a single PLD without asking a chip manufacturer to design and fabricate a dedicated integrated circuit chip. Moreover, instead of making integrated circuit chips by hand, this programming is nowadays mostly done with "logic compiler" software, which is similar to the software compiler used in program development; the source code to be compiled must likewise be written in a specific programming language, called a Hardware Description Language (HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are the most commonly used at present.
It will also be apparent to those skilled in the art that a hardware circuit implementing a logical method flow can be readily obtained with only a little logic programming of the method flow into an integrated circuit using one of the hardware description languages above.
The controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, or the form of logic gates, switches, an Application Specific Integrated Circuit (ASIC), a programmable logic controller, or an embedded microcontroller. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller may also be implemented as part of the control logic of the memory. Those skilled in the art will also appreciate that, in addition to implementing the controller as pure computer-readable program code, the same functionality can be achieved by logically programming the method steps so that the controller takes the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller may thus be considered a hardware component, and the means included in it for performing the various functions may also be considered structures within the hardware component. The means for performing the various functions may even be regarded both as software modules implementing the method and as structures within the hardware component.
The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. One typical implementation device is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as being divided into various units by function, and the units are described separately. Of course, when implementing the present application, the functions of the units may be implemented in one or more pieces of software and/or hardware.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as Random Access Memory (RAM), and/or non-volatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include persistent and non-persistent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media, such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element introduced by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
The embodiments in this specification are described in a progressive manner; for parts that are the same or similar across embodiments, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, because the system embodiment is substantially similar to the method embodiment, it is described relatively briefly; for the relevant points, refer to the corresponding parts of the description of the method embodiment.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (8)

1. An illegal request identification method, comprising:
after a data access request is received, determining a target Uniform Resource Locator (URL) of the request;
determining the similarity between the target URL and a predetermined target illegal URL, wherein the target illegal URL is a part or all of illegal URLs, and the illegal URLs are URLs which are requested to be accessed by illegal users;
if the similarity meets a first preset condition, determining that the data access request is an illegal request;
wherein the target illegal URL is determined by:
obtaining an illegal URL;
determining the similarity between illegal URLs;
taking the illegal URL with the similarity meeting a second preset condition with other illegal URLs as the target illegal URL; or
taking the illegal URL with the similarity meeting a third preset condition with other illegal URLs as the target illegal URL;
and the second preset condition comprises at least one of the following conditions:
the average value of all the similarity between the illegal URL and other illegal URLs is larger than a set average value threshold value;
the average value of all similarities between the illegal URL and other illegal URLs is higher than a first preset order in average value sequencing, and the average value sequencing is sequencing according to the sequence of the average value from big to small;
and the third preset condition comprises at least one of the following conditions:
the similarity between the illegal URL and other illegal URLs is larger than a set similarity threshold;
the sequence of the similarity between the illegal URL and other illegal URLs in the similarity sequence is higher than the second preset sequence, and the similarity sequence is a sequence from the big to the small of the similarity.
2. The method of claim 1, wherein after obtaining the illegal URL to which the illegal user requests access, the method further comprises:
classifying the illegal URL according to the dimensionality of an access interface;
then, determining the similarity between the illegal URLs specifically includes:
determining the similarity between illegal URLs belonging to the same category;
then, taking the illegal URL whose similarity with other illegal URLs meets a second preset condition as the target illegal URL, specifically including:
and taking the illegal URL with the similarity meeting a second preset condition with other illegal URLs in the illegal URLs belonging to the same category as a target illegal URL.
3. The method of claim 1, wherein prior to determining the similarity of the target URL to a predetermined target illegitimate URL, the method further comprises:
determining a target access interface corresponding to the target URL;
and determining a target illegal URL corresponding to the target access interface according to the set mapping relation between the target illegal URL and the access interface.
4. An illegal request recognition device, comprising:
the target URL determining unit is used for determining a target Uniform Resource Locator (URL) of a request after receiving a data access request;
the first similarity determining unit is used for determining the similarity between the target URL and a predetermined target illegal URL, wherein the target illegal URL is a part or all of illegal URLs, and the illegal URLs are URLs which are requested to be accessed by illegal users;
an illegal request determining unit, configured to determine that the data access request is an illegal request if the similarity satisfies a first preset condition;
the device further comprises:
an acquisition unit that acquires an illegal URL;
a second similarity determining unit that determines a similarity between the illegal URLs;
a first target illegal URL determining unit, which takes an illegal URL with similarity to other illegal URLs meeting a second preset condition as the target illegal URL; or
a third target illegal URL determining unit, which takes an illegal URL with similarity to other illegal URLs meeting a third preset condition as the target illegal URL;
and the second preset condition comprises at least one of the following conditions:
the average value of all the similarity between the illegal URL and other illegal URLs is larger than a set average value threshold value;
the average value of all similarities between the illegal URL and other illegal URLs is higher than a first preset order in average value sequencing, and the average value sequencing is sequencing according to the sequence of the average value from big to small;
and the third preset condition comprises at least one of the following conditions:
the similarity between the illegal URL and other illegal URLs is larger than a set similarity threshold;
the sequence of the similarity between the illegal URL and other illegal URLs in the similarity sequence is higher than the second preset sequence, and the similarity sequence is a sequence from the big to the small of the similarity.
5. The apparatus of claim 4, wherein the apparatus further comprises:
the classification unit is used for classifying the illegal URL according to the dimensionality of the access interface;
then, the second similarity determination unit is specifically configured to:
determining the similarity between illegal URLs belonging to the same category;
then, the first target illegal URL determining unit is specifically configured to:
and taking the illegal URL with the similarity meeting a second preset condition with other illegal URLs in the illegal URLs belonging to the same category as a target illegal URL.
6. The apparatus of claim 4, wherein the apparatus further comprises:
a target access interface determining unit which determines a target access interface corresponding to the target URL;
and the second target illegal URL determining unit determines a target illegal URL corresponding to the target access interface according to the set mapping relation between the target illegal URL and the access interface.
7. An electronic device, comprising:
a processor; and
a memory arranged to store computer executable instructions that, when executed, cause the processor to:
after a data access request is received, determining a target Uniform Resource Locator (URL) of the request;
determining the similarity between the target URL and a predetermined target illegal URL, wherein the target illegal URL is a part or all of illegal URLs, and the illegal URLs are URLs which are requested to be accessed by illegal users;
if the similarity meets a first preset condition, determining that the data access request is an illegal request;
wherein the target illegal URL is determined by:
obtaining an illegal URL;
determining the similarity between illegal URLs;
taking the illegal URL with the similarity meeting a second preset condition with other illegal URLs as the target illegal URL; or
taking the illegal URL with the similarity meeting a third preset condition with other illegal URLs as the target illegal URL;
and the second preset condition comprises at least one of the following conditions:
the average value of all the similarity between the illegal URL and other illegal URLs is larger than a set average value threshold value;
the average value of all similarities between the illegal URL and other illegal URLs is higher than a first preset order in average value sequencing, and the average value sequencing is sequencing according to the sequence of the average value from big to small;
and the third preset condition comprises at least one of the following conditions:
the similarity between the illegal URL and other illegal URLs is larger than a set similarity threshold;
the sequence of the similarity between the illegal URL and other illegal URLs in the similarity sequence is higher than the second preset sequence, and the similarity sequence is a sequence from the big to the small of the similarity.
8. A computer-readable storage medium storing one or more programs that, when executed by an electronic device including a plurality of application programs, cause the electronic device to:
after a data access request is received, determining a target Uniform Resource Locator (URL) of the request;
determining the similarity between the target URL and a predetermined target illegal URL, wherein the target illegal URL is a part or all of illegal URLs, and the illegal URLs are URLs which are requested to be accessed by illegal users;
if the similarity meets a first preset condition, determining that the data access request is an illegal request;
wherein the target illegal URL is determined by:
obtaining an illegal URL;
determining the similarity between illegal URLs;
taking the illegal URL with the similarity meeting a second preset condition with other illegal URLs as the target illegal URL; or
taking the illegal URL with the similarity meeting a third preset condition with other illegal URLs as the target illegal URL;
and the second preset condition comprises at least one of the following conditions:
the average value of all the similarity between the illegal URL and other illegal URLs is larger than a set average value threshold value;
the average value of all similarities between the illegal URL and other illegal URLs is higher than a first preset order in average value sequencing, and the average value sequencing is sequencing according to the sequence of the average value from big to small;
and the third preset condition comprises at least one of the following conditions:
the similarity between the illegal URL and other illegal URLs is larger than a set similarity threshold;
the sequence of the similarity between the illegal URL and other illegal URLs in the similarity sequence is higher than the second preset sequence, and the similarity sequence is a sequence from the big to the small of the similarity.
CN201811624702.1A 2018-12-28 2018-12-28 Illegal request identification method and device and electronic equipment Active CN109743309B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811624702.1A CN109743309B (en) 2018-12-28 2018-12-28 Illegal request identification method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811624702.1A CN109743309B (en) 2018-12-28 2018-12-28 Illegal request identification method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN109743309A CN109743309A (en) 2019-05-10
CN109743309B true CN109743309B (en) 2021-09-10

Family

ID=66361908

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811624702.1A Active CN109743309B (en) 2018-12-28 2018-12-28 Illegal request identification method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN109743309B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110381017A (en) * 2019-06-12 2019-10-25 微梦创科网络科技(中国)有限公司 A kind of illegal request recognition methods and device
CN110460587B (en) * 2019-07-23 2022-01-25 平安科技(深圳)有限公司 Abnormal account detection method and device and computer storage medium
CN111091019B (en) * 2019-12-23 2024-03-01 支付宝(杭州)信息技术有限公司 Information prompting method, device and equipment
CN111865998A (en) * 2020-07-24 2020-10-30 广西科技大学 Network security zone login method and device

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101035128A (en) * 2007-04-18 2007-09-12 大连理工大学 Three-folded webpage text content recognition and filtering method based on the Chinese punctuation
CN102469117A (en) * 2010-11-08 2012-05-23 中国移动通信集团广东有限公司 Method and device for identifying abnormal access action
CN102647408A (en) * 2012-02-27 2012-08-22 珠海市君天电子科技有限公司 Method for judging phishing website based on content analysis
CN103685294A (en) * 2013-12-20 2014-03-26 北京奇虎科技有限公司 Method and device for identifying attack sources of denial of service attack
CN105046124A (en) * 2015-07-31 2015-11-11 小米科技有限责任公司 Security protection method and apparatus
CN106055574A (en) * 2016-05-19 2016-10-26 微梦创科网络科技(中国)有限公司 Method and device for recognizing illegal URL
CN106302534A (en) * 2016-09-30 2017-01-04 微梦创科网络科技(中国)有限公司 Method and system for detecting and handling illegal users
CN106713371A (en) * 2016-12-08 2017-05-24 中国电子科技网络信息安全有限公司 Fast Flux botnet detection method based on DNS anomaly mining
CN107046544A (en) * 2017-05-02 2017-08-15 深圳乐信软件技术有限公司 Method and apparatus for identifying unauthorized access requests to a website
CN107277014A (en) * 2017-06-20 2017-10-20 黄河科技学院 Identification device for detecting illegal access to a computer network
CN107426199A (en) * 2017-07-05 2017-12-01 浙江鹏信信息科技股份有限公司 Method and system for detecting and analyzing abnormal network behavior
CN107465648A (en) * 2016-06-06 2017-12-12 腾讯科技(深圳)有限公司 Method and device for identifying abnormal devices
CN107872452A (en) * 2017-10-25 2018-04-03 东软集团股份有限公司 Malicious website identification method, device, storage medium and program product
CN108965251A (en) * 2018-06-08 2018-12-07 广州大学 Cloud-combined secure mobile phone protection system

Also Published As

Publication number Publication date
CN109743309A (en) 2019-05-10

Similar Documents

Publication Publication Date Title
CN109743309B (en) Illegal request identification method and device and electronic equipment
CN106055574B (en) Method and device for identifying illegal uniform resource locator (URL)
CN106789831B (en) Method and device for identifying network attack
CN108304410B (en) Method and device for detecting abnormal access page and data analysis method
CN109246064B (en) Method, device and equipment for generating security access control and network access rule
CN108449316B (en) Anti-crawler method, server and client
KR101530941B1 (en) Method, system and client terminal for detection of phishing websites
WO2013078307A1 (en) Image searching
CN109298987B (en) Method and device for detecting running state of web crawler
CN109450969B (en) Method and device for acquiring data from third-party data source server and server
CN114900546B (en) Data processing method, device and equipment and readable storage medium
WO2017063596A1 (en) Method, apparatus and device for processing sitemap
US11797617B2 (en) Method and apparatus for collecting information regarding dark web
CN107688563B (en) Synonym recognition method and recognition device
CN110619075A (en) Webpage identification method and equipment
CN110929129B (en) Information detection method, equipment and machine-readable storage medium
CN110717036B (en) Method and device for removing duplication of uniform resource locator and electronic equipment
CN108830298B (en) Method and device for determining user feature tag
CN110598115A (en) Sensitive webpage identification method and system based on artificial intelligence multi-engine
CN108268775B (en) Web vulnerability detection method and device, electronic equipment and storage medium
CN110019295B (en) Database retrieval method, device, system and storage medium
CN114710318A (en) Method, device, equipment and medium for limiting high-frequency access of crawler
CN111625721B (en) Content recommendation method and device
CN110825976B (en) Website page detection method and device, electronic equipment and medium
CN110018844B (en) Management method and device of decision triggering scheme and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant