CN113905092B - Method, device, terminal and storage medium for determining reusable agent queue - Google Patents
- Publication number
- CN113905092B (application number CN202111142651.0A)
- Authority
- CN
- China
- Prior art keywords
- agent
- environment
- current
- target website
- queue
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Computer And Data Communications (AREA)
Abstract
The invention discloses a method for determining a reusable proxy queue, together with a corresponding device, terminal and storage medium. In the method, a proxy environment is allocated for the target website that a client wants to access, and the result of the access made through that proxy environment is used to determine whether the proxy environment is suitable for the target website. If it is, the proxy environment is stored in the reusable proxy queue corresponding to the target website, so that other clients can access the target website more easily and processing efficiency is improved.
Description
Technical Field
The present invention relates to the field of computer application design, and in particular, to a method, an apparatus, a terminal, and a storage medium for determining a reusable agent queue.
Background
Websites are generally built with protective measures. One of the most common is to limit the access frequency of each IP address, so that the information a single IP can obtain from the target website per unit time is very limited.
As a result, when network data is collected, the protection mechanism of the target website can make information acquisition inefficient, and it is difficult to collect the information on the target website comprehensively.
To solve this problem, the prior art generally adopts an IP proxy pool or a random tunnel proxy. However, existing schemes often suffer from problems such as low management efficiency, unsatisfactory connection speed and a lack of thread safety.
Disclosure of Invention
The present invention aims to solve at least one of the technical problems existing in the prior art. To this end, the present invention proposes a method for determining a reusable proxy queue, which can provide a solution with high concurrency controllability and rapid processing.
The invention also provides a device, a terminal and a storage medium for determining the reusable proxy queue.
A method for determining a reusable proxy queue according to an embodiment of the first aspect of the present invention comprises the steps of:
acquiring an access request from a client;
determining a current agent environment applicable to the access request;
sending the access request to a target website based on the current agent environment, and judging whether the current agent environment is suitable for the target website according to a response message of the target website;
and if the current proxy environment is suitable for the target website, storing the current proxy environment into a reusable proxy environment queue corresponding to the target website.
The method for determining the reusable proxy queue according to the embodiment of the invention has at least the following beneficial effects: after a proxy environment is allocated to a client and the request is sent, a corresponding reusable proxy environment queue can be built for each target website according to the response message of the target website, so that clients that subsequently access the target website can select an already verified, directly usable proxy environment from the reusable proxy queue, which improves access efficiency.
According to some embodiments of the invention, the step of determining a current agent context applicable to the access request comprises:
determining a target website to be accessed by the access request, and acquiring a reusable agent environment queue corresponding to the target website;
if the reusable agent environment queue has available agent environments, selecting an agent environment from the reusable agent environment queue as a current agent environment;
and if no proxy environment is available in the reusable proxy environment queue, selecting a proxy that is in neither the blacklist nor the gray list from the available proxy queue as the current proxy environment, wherein the blacklist records proxies that are not applicable to the target website, and the gray list records proxies that are currently in use.
According to some embodiments of the invention, the step of selecting an agent from the available agent queue that is not in the black list or the gray list as the current agent environment includes:
selecting agents from the available agent queue that are not among the black list or the gray list;
combining the agent with the website information of the target website, and taking the combined result as the current agent environment corresponding to the target website.
According to some embodiments of the invention, the step of determining the available agent queues comprises:
obtaining source proxies from a third-party proxy provider and placing them into a proxy pool;
determining a response speed level of each source proxy in the proxy pool;
periodically storing the source proxies into the available proxy queue, using the response speed level of each source proxy as the storage priority.
According to some embodiments of the invention, the step of determining whether the current agent environment is applicable to the target website according to the response message of the target website includes:
if the response status code in the response message is an environment failure status code, determining that the current agent environment is not suitable for the target website;
if the connection between the current agent environment and the target website cannot be established, judging that the current agent environment is not suitable for the target website;
if the current proxy environment can connect to the target website but receiving the response of the target website times out, judging that the current proxy environment is not suitable for the target website;
if the response status code in the response message is a server abnormal status code, determining that the current agent environment is not suitable for the target website;
and otherwise, judging that the current agent environment is suitable for the target website.
According to some embodiments of the invention, after determining whether the current agent environment is suitable for the target website according to the response message of the target website, the method further includes:
and if the current proxy environment is not suitable for the target website, detecting the state of the current proxy environment itself and processing the current proxy environment according to the detection result.
According to some embodiments of the present invention, the step of detecting the state of the current proxy environment itself and processing it according to the detection result, if the current proxy environment is not suitable for the target website, specifically includes:
accessing a public network server through the current proxy environment;
if the current proxy environment can access the public network server, taking no action;
if the current proxy environment cannot establish a network connection with the public network server, deleting the current proxy environment from the available proxy queue and writing it into the blacklist;
if the current proxy environment successfully establishes a network connection with the public network server but receiving the server's response times out, deleting the current proxy environment from the available proxy queue and writing it into the blacklist;
and if the current proxy environment establishes a network connection with the public network server but the response status code is an abnormal status code, deleting the current proxy environment from the available proxy queue and writing it into the blacklist.
An apparatus for determining a reusable proxy queue according to an embodiment of a second aspect of the present application, comprising:
the access request acquisition module can acquire an access request from a client;
the agent environment distribution module can determine the current agent environment applicable to the access request;
the response processing module can send the access request to a target website based on the current agent environment, and judge whether the current agent environment is suitable for the target website according to the response message of the target website;
and the queue maintenance module is used for storing the current agent environment to a reusable agent environment queue corresponding to the target website if the current agent environment is suitable for the target website.
The device for determining the reusable agent queue according to the embodiment of the invention has at least the following beneficial effects: the device can determine whether the current agent environment is suitable for the target website based on the access result of the website after the agent environment is distributed to the client, and then the agent environment which can be suitable for the target website is saved into the reusable agent queue by using the queue maintenance module, so that the access efficiency of the subsequent process is improved.
Further, the agent environment allocation module further includes:
the queue selection element can determine a target website to be accessed by the access request and acquire a reusable agent environment queue corresponding to the target website;
a first proxy environment selection element that selects a proxy environment from the reusable proxy environment queue as a current proxy environment if there is a proxy environment available in the reusable proxy environment queue;
and a second proxy environment selection element that, if no proxy environment is available in the reusable proxy environment queue, selects a proxy that is in neither the blacklist nor the gray list from the available proxy queue as the current proxy environment, wherein the blacklist records proxies that are not applicable to the target website, and the gray list records proxies that are currently in use.
Further, the second agent environment selection element further includes:
a list screening unit capable of selecting agents from the available agent queues that are not among the black list or the gray list;
and the environment construction unit can combine the agent with the website information of the target website, and takes the combined result as the current agent environment corresponding to the target website.
Further, the agent environment allocation module further includes an available queue acquisition element, the available queue acquisition element including:
a proxy collection unit capable of acquiring source proxies from a third-party proxy provider and placing them into a proxy pool;
a rate detection unit capable of determining a response speed level of each source proxy in the proxy pool;
an available proxy queue construction unit capable of periodically storing the source proxies into the available proxy queue, using the response speed level of each source proxy as the storage priority.
A terminal according to an embodiment of the third aspect of the present invention comprises: a memory, a processor and a computer program stored on the memory and executable on the processor, the processor executing the computer program to implement the method of determining a reusable proxy queue described above.
A computer readable storage medium according to an embodiment of the fourth aspect of the present invention stores computer executable instructions for performing the method of determining a reusable proxy queue described above.
Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
The foregoing and/or additional aspects and advantages of the invention will become apparent and may be better understood from the following description of embodiments taken in conjunction with the accompanying drawings in which:
FIG. 1 is a schematic diagram illustrating steps of a method for determining a reusable agent queue according to a first embodiment of the present invention;
FIG. 2 is a diagram illustrating steps of a method for determining a reusable agent queue according to a second embodiment of the present invention;
FIG. 3 is a schematic diagram of the steps for determining the available agent queues in FIG. 2;
FIG. 4 is a block diagram of an apparatus for determining a reusable agent queue in accordance with an embodiment of the present invention.
Detailed Description
Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein the same or similar reference signs designate the same or similar elements, or elements having the same or similar functions, throughout. The embodiments described below with reference to the drawings are illustrative only and are not to be construed as limiting the invention.
When a crawler program is used to acquire data from a public website, the website's defense mechanism restricts the use of a single IP, so the crawler cannot acquire the data successfully. For the crawler to run smoothly, the target website needs to be crawled through proxy IPs.
When a large number of proxies are used, the proxies need to be managed uniformly because the connection speed and state of each proxy differ. However, existing schemes manage large numbers of proxy IPs inefficiently, and thread safety problems may occur when multiple requests arrive at the same time.
In order to solve the problems in the prior art, a method for determining a reusable agent queue is now proposed.
Embodiment 1
Referring to FIG. 1, the method comprises at least the following steps:
Step S100, acquiring an access request from a client.
Step S200, determining a current proxy environment applicable to the access request.
One proxy environment is selected from the proxy environments for use by the current access request.
Step S300, based on the current proxy environment, sending the access request to a target website, and judging whether the current proxy environment is suitable for the target website according to a response message of the target website.
Whether the current proxy environment is suitable for the target website is judged from the response message returned by the accessed website.
Step S400, if the current proxy environment is suitable for the target website, storing the current proxy environment into a reusable proxy environment queue corresponding to the target website.
Because a suitable proxy environment is stored in the reusable proxy environment queue, fewer judgment steps are needed when the target website is accessed later, which speeds up access.
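The four steps above can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `handle_request`, `choose_env` and `send` are hypothetical, a proxy environment is assumed to be a simple (proxy address, target site) pair, and the selection and sending logic are passed in as callables so that the sketch stays independent of any network library.

```python
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, Tuple

# Assumption: a "proxy environment" is modelled as (proxy address, target site).
ProxyEnv = Tuple[str, str]

# One reusable proxy environment queue per target website.
reusable_queues: Dict[str, Deque[ProxyEnv]] = defaultdict(deque)


def handle_request(
    site: str,
    choose_env: Callable[[str], ProxyEnv],
    send: Callable[[ProxyEnv], bool],
) -> bool:
    """Run steps S100 to S400 for one access request aimed at `site`."""
    env = choose_env(site)   # S200: determine the current proxy environment
    suitable = send(env)     # S300: send the request and judge suitability
    if suitable:
        # S400: keep the environment so later requests can reuse it
        reusable_queues[site].append(env)
    return suitable
```

Keeping one queue per website reflects the point made in the text that suitability is judged per target website rather than globally.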
Embodiment 2
Referring to FIG. 2, this embodiment describes the method in more detail on the basis of Embodiment 1. The method includes the steps of:
Step S100, acquiring an access request from a client.
Step S200, determining a current proxy environment applicable to the access request. Specifically:
Step S201, determining the target website to be accessed by the access request, and obtaining the reusable proxy environment queue corresponding to the target website.
The target website to be accessed is identified from the client's access request, and the reusable proxy environment queue corresponding to that website is looked up. It will be appreciated that if no reusable proxy environment queue exists for the target website, one is created.
Step S202, if a proxy environment is available in the reusable proxy environment queue, selecting a proxy environment from the reusable proxy environment queue as the current proxy environment.
When the reusable proxy environment queue of the target website contains a usable proxy environment, a proxy environment already known to be able to access the target website is selected directly from that queue, which ensures reliable access and improves overall access efficiency.
Step S203, if no proxy environment is available in the reusable proxy environment queue, selecting a proxy that is in neither the blacklist nor the gray list from the available proxy queue as the current proxy environment, wherein the blacklist records proxies that are not applicable to the target website, and the gray list records proxies that are currently in use.
When no proxy environment can be obtained from the reusable proxy environment queue, the reusable proxy environments have been used up, and a fallback is needed to obtain the current proxy environment. The fallback comprises:
a. selecting a proxy from the available proxy queue that is in neither the blacklist nor the gray list;
b. combining the proxy with the website information of the target website, and taking the combined result as the current proxy environment corresponding to the target website.
In these steps, an idle proxy is selected and then turned into a proxy environment that can be used for the target website. Under high concurrency, this trades some accuracy for higher access efficiency.
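Steps S201 to S203 might look like the sketch below. Several details are assumptions made for illustration: the class name `ProxySelector`, the use of a single lock to make selection thread safe, a single global blacklist set, removing an environment from the reusable queue when it is handed out, and adding a freshly selected proxy to the gray list. The source does not specify these choices.

```python
import threading
from collections import defaultdict, deque
from typing import Deque, Dict, Iterable, Optional, Set, Tuple

ProxyEnv = Tuple[str, str]  # assumption: (proxy address, target site)


class ProxySelector:
    def __init__(self, available: Iterable[str]) -> None:
        self._lock = threading.Lock()                  # guards all shared state
        self.available: Deque[str] = deque(available)  # available proxy queue
        self.reusable: Dict[str, Deque[ProxyEnv]] = defaultdict(deque)
        self.blacklist: Set[str] = set()               # proxies judged unusable
        self.graylist: Set[str] = set()                # proxies currently in use

    def select(self, site: str) -> Optional[ProxyEnv]:
        with self._lock:
            queue = self.reusable[site]   # S201: get (or create) the site's queue
            if queue:                     # S202: reuse a known-good environment
                return queue.popleft()
            for proxy in self.available:  # S203: fall back to an idle proxy
                if proxy not in self.blacklist and proxy not in self.graylist:
                    self.graylist.add(proxy)  # mark the proxy as in use
                    return (proxy, site)      # combine proxy with site info
            return None                   # nothing usable right now
```

Holding one lock for the whole selection keeps the check-then-take sequence atomic when several requests arrive at once, at the cost of serialising selection; finer-grained locking is possible but is beyond this sketch.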
Step S300, based on the current proxy environment, sending the access request to a target website, and judging whether the current proxy environment is suitable for the target website according to a response message of the target website.
After the access is completed, the connection state between the current proxy environment and the target website is judged according to the response message returned by the target website:
if the response status code in the response message is an environment failure status code, the current proxy environment is judged not to be suitable for the target website;
if the connection between the current proxy environment and the target website cannot be established, the current proxy environment is judged not to be suitable for the target website;
if the current proxy environment can connect to the target website but receiving the response of the target website times out, the current proxy environment is judged not to be suitable for the target website;
if the response status code in the response message is a server abnormal status code, the current proxy environment is judged not to be suitable for the target website;
otherwise, the current proxy environment is judged to be suitable for the target website.
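The judgment rules above can be expressed as a small pure function. The source does not say which status codes count as "environment failure" or "server abnormal", so the two code sets below are hypothetical placeholders (403, 407 and 429 for the former, the 5xx range for the latter), as are the names `Outcome` and `is_suitable`.

```python
from enum import Enum
from typing import Optional


class Outcome(Enum):
    CONNECT_FAILED = "connect_failed"  # connection could not be established
    READ_TIMEOUT = "read_timeout"      # connected, but the response timed out
    RESPONSE = "response"              # a response with a status code arrived


# Hypothetical code sets; the source does not enumerate them.
ENV_FAILURE_CODES = frozenset({403, 407, 429})
SERVER_ERROR_CODES = frozenset(range(500, 600))


def is_suitable(outcome: Outcome, status: Optional[int] = None) -> bool:
    """Return True if the proxy environment is suitable for the target site."""
    if outcome is Outcome.CONNECT_FAILED:
        return False
    if outcome is Outcome.READ_TIMEOUT:
        return False
    if status in ENV_FAILURE_CODES or status in SERVER_ERROR_CODES:
        return False
    return True  # "otherwise" branch: the environment is suitable
```

For example, `is_suitable(Outcome.RESPONSE, 200)` yields `True`, while a timeout or a 503 response yields `False`.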
According to some embodiments of the present application, when the current proxy environment is judged not to be suitable for the target website, the reason needs to be determined, because a proxy may fail to access the server for various reasons: it may have been banned by the target website, or the proxy itself may have a problem.
To check the connection, a public network server is accessed through the current proxy environment.
If the current proxy environment can access the public network server, the proxy environment itself has no problem and has been blocked by the target website;
if the current proxy environment cannot establish a network connection with the public network server, the proxy is considered dead, is deleted from the available proxy queue and is written into the blacklist;
if the current proxy environment successfully establishes a connection with the public network server but receiving the server's response times out, the proxy is considered dead, is deleted from the available proxy queue and is written into the blacklist;
if the current proxy environment establishes a network connection with the public network server but the response status code is an abnormal status code, the proxy is considered dead, is deleted from the available proxy queue and is written into the blacklist.
It is understood that a public network server means an authoritative server that is accessible most of the time, such as the well-known search engine Baidu (http://www.baidu.com). When the public network is accessed, several public network servers can be accessed simultaneously to reduce the possibility of misjudgment.
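The self-check can be sketched as below. The function name `self_check` and the `probe` callable are hypothetical: `probe(proxy, url)` is assumed to return `True` only when the proxy connects, receives a response in time and gets a normal status code. Treating the proxy as alive when any one of several public servers succeeds is also an assumption about how "reducing misjudgment" is meant, and the probes run one after another here rather than simultaneously.

```python
from collections import deque
from typing import Callable, Deque, Iterable, Set

# Hypothetical probe: True when the proxy reaches the public server normally.
Probe = Callable[[str, str], bool]


def self_check(
    proxy: str,
    public_urls: Iterable[str],
    probe: Probe,
    available: Deque[str],
    blacklist: Set[str],
) -> bool:
    """Return True if the proxy is alive, i.e. only blocked by the target site."""
    alive = any(probe(proxy, url) for url in public_urls)
    if not alive:
        # The proxy is considered dead: drop it and remember it.
        if proxy in available:
            available.remove(proxy)
        blacklist.add(proxy)
    return alive
```

A proxy that passes the check is left untouched, matching the "no action" branch in the text.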
Step S400, if the current proxy environment is suitable for the target website, storing the current proxy environment into a reusable proxy environment queue corresponding to the target website.
After the current proxy environment is judged to be suitable for the target website, it is stored in the reusable proxy environment queue, so that it can be used directly when called again later, which reduces the judging process and improves access efficiency.
According to some preferred embodiments of the present application, on the basis of embodiment two, a step of determining available proxy queues is provided.
Referring to FIG. 3, the determination of the available proxy queue has no fixed position in the step sequence, because it is usually carried out in parallel with the other steps. It comprises:
Step A100, acquiring source proxies from a third-party proxy provider and placing them into a proxy pool.
Step A200, determining the response speed level of each source proxy in the proxy pool.
Each source proxy in the proxy pool is connected to the public network server to confirm that the proxy is usable, and the response time of the source proxy when accessing the public network server is recorded.
If the response time is less than 1000 milliseconds, the source proxy is judged to be a high-quality proxy;
if the response time is more than 1000 milliseconds and less than 3000 milliseconds, the source proxy is judged to be a medium-quality proxy;
if the response time is more than 3000 milliseconds, the source proxy is judged to be a poor-quality proxy.
The thresholds here are empirical values. Generally speaking, a proxy that responds within three seconds has a good connection speed, while one that takes more than three seconds is often of poor quality.
Step A300, using the response speed level of each source proxy as the storage priority, and periodically storing the source proxies into the available proxy queue.
According to the response speed levels from Step A200, high-quality and medium-quality proxies are stored in the available proxy queue preferentially, which reduces the proportion of poor-quality proxies and further increases the processing efficiency of the whole method.
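Steps A200 and A300 can be illustrated with a priority ordering over measured response times. The names `speed_level` and `build_available_queue` are hypothetical. The text leaves the exact boundary values open, so assigning exactly 1000 ms to the medium level and exactly 3000 ms to the poor level is an assumption, as is ordering proxies of the same level by their response time. The periodic scheduling of step A300 is omitted.

```python
import heapq
from typing import List, Tuple

HIGH, MEDIUM, POOR = 0, 1, 2  # smaller value means higher storage priority


def speed_level(response_ms: float) -> int:
    """Map a measured response time in milliseconds to a speed level."""
    if response_ms < 1000:
        return HIGH
    if response_ms < 3000:  # assumption: exactly 1000 ms counts as medium
        return MEDIUM
    return POOR             # assumption: exactly 3000 ms counts as poor


def build_available_queue(measured: List[Tuple[str, float]]) -> List[str]:
    """Order proxies by speed level, then by response time within a level."""
    heap = [(speed_level(ms), ms, proxy) for proxy, ms in measured]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]
```

In a running system this ordering would be recomputed on a timer so that the available proxy queue tracks current proxy quality.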
Referring to fig. 4, a further embodiment of the present application provides an apparatus for determining a reusable proxy queue, where the apparatus 20 includes an access request acquisition module 201, a proxy environment allocation module 202, a response processing module 203, and a queue maintenance module 204.
An access request acquisition module 201 capable of acquiring an access request from a client;
a proxy context allocation module 202 capable of determining a current proxy context applicable to the access request;
the response processing module 203 is configured to send the access request to a target website based on the current proxy environment, and determine whether the current proxy environment is applicable to the target website according to a response message of the target website;
and the queue maintenance module 204 is used for storing the current agent environment to a reusable agent environment queue corresponding to the target website if the current agent environment is suitable for the target website.
The device 20 can determine whether the current agent environment is suitable for the target website based on the access result of the website after the agent environment is allocated to the client, and then store the agent environment which can be suitable for the target website into the reusable agent queue by using the queue maintenance module, so that the access efficiency of the subsequent process is increased.
Further, the agent environment allocation module further includes:
the queue selection element can determine a target website to be accessed by the access request and acquire a reusable agent environment queue corresponding to the target website;
a first proxy environment selection element that selects a proxy environment from the reusable proxy environment queue as a current proxy environment if there is a proxy environment available in the reusable proxy environment queue;
and a second proxy environment selection element that, if no proxy environment is available in the reusable proxy environment queue, selects a proxy that is in neither the blacklist nor the gray list from the available proxy queue as the current proxy environment, wherein the blacklist records proxies that are not applicable to the target website, and the gray list records proxies that are currently in use.
Further, the second agent environment selection element further includes:
a list screening unit capable of selecting agents from the available agent queues that are not among the black list or the gray list;
and the environment construction unit can combine the agent with the website information of the target website, and takes the combined result as the current agent environment corresponding to the target website.
Further, the agent environment allocation module further includes an available queue acquisition element, the available queue acquisition element including:
a proxy collection unit capable of acquiring source proxies from a third-party proxy provider and placing them into a proxy pool;
a rate detection unit capable of determining a response speed level of each source proxy in the proxy pool;
an available proxy queue construction unit capable of periodically storing the source proxies into the available proxy queue, using the response speed level of each source proxy as the storage priority.
Yet another embodiment of the present application provides a terminal, including: a memory, a processor and a computer program stored on the memory and executable on the processor, the processor executing the computer program to implement the method of determining a reusable agent queue described above.
In particular, the processor may be a CPU, general purpose processor, DSP, ASIC, FPGA or other programmable logic device, transistor logic device, hardware component, or any combination thereof. Which may implement or perform the various exemplary logic blocks, modules, and circuits described in connection with this disclosure. A processor may also be a combination that performs computing functions, e.g., including one or more microprocessors, a combination of a DSP and a microprocessor, and the like.
In particular, the processor is coupled to the memory via a bus, which may include a path for communicating information. The bus may be a PCI bus or an EISA bus, etc. The buses may be divided into address buses, data buses, control buses, etc.
The memory may be, but is not limited to, a ROM or other type of static storage device that can store static information and instructions, a RAM or other type of dynamic storage device that can store information and instructions, an EEPROM, a CD-ROM or other optical disc storage (including compact disc, laser disc, digital versatile disc, Blu-ray disc, etc.), magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer.
Optionally, the memory is used to store the code of the computer program that implements the scheme of the application, and execution of the code is controlled by the processor. The processor is configured to execute the application code stored in the memory to implement any of the methods of FIGS. 1-4.
Yet another embodiment of the present application provides a computer readable storage medium storing computer executable instructions for performing the method of determining a reusable agent queue shown in fig. 1 described above.
The above described apparatus embodiments are merely illustrative, wherein the units illustrated as separate components may or may not be physically separate, i.e. may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
Those of ordinary skill in the art will appreciate that all or some of the steps, systems, and methods disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). The term computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data, as known to those skilled in the art. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. Furthermore, as is well known to those of ordinary skill in the art, communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media.
While the preferred embodiments of the present application have been described in detail, the present application is not limited to the above embodiments, and various equivalent modifications and substitutions can be made by those skilled in the art without departing from the spirit of the present application, and these equivalent modifications and substitutions are intended to be included in the scope of the present application as defined in the appended claims.
Claims (7)
1. A method for determining a reusable agent queue, comprising the steps of:
acquiring an access request from a client;
determining a current agent environment applicable to the access request, including:
determining a target website to be accessed by the access request, and acquiring a reusable agent environment queue corresponding to the target website;
if the reusable agent environment queue has available agent environments, selecting an agent environment from the reusable agent environment queue as a current agent environment;
if there is no available agent environment in the reusable agent environment queue, selecting an agent which is not in the blacklist or the gray list from the available agent queue as the current agent environment, including:
selecting agents which are not in a blacklist or a gray list from an available agent queue, wherein the blacklist records agents which are not applicable to the target website, and the gray list records agents which are being used;
combining the agent with the website information of the target website, and taking the combined result as a current agent environment corresponding to the target website;
sending the access request to a target website based on the current agent environment, and judging whether the current agent environment is suitable for the target website according to a response message of the target website;
if the current agent environment is suitable for the target website, storing the current agent environment into a reusable agent environment queue corresponding to the target website;
and if the current agent environment is not suitable for the target website, detecting the state of the current agent environment itself and processing the current agent environment according to the detection result.
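The selection logic of claim 1 can be pictured with a minimal sketch. It reads "agent" as a network proxy address and makes several assumptions that the claim does not state: the class and method names (`AgentAllocator`, `select`, `report`) are hypothetical, the blacklist is kept per target website, and an environment taken from the reusable queue is also marked in the gray list while it is in use. The state check for an unsuitable environment is left to the caller.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AgentEnvironment:
    agent: str    # agent (proxy) address; the format is an assumption
    website: str  # target website the agent is combined with


@dataclass
class AgentAllocator:
    available: deque = field(default_factory=deque)  # available agent queue
    reusable: dict = field(default_factory=dict)     # website -> deque of environments
    blacklist: dict = field(default_factory=dict)    # website -> set of unsuitable agents
    graylist: set = field(default_factory=set)       # agents currently being used

    def select(self, website):
        """Return a current agent environment for the website, or None."""
        queue = self.reusable.setdefault(website, deque())
        if queue:
            # Reuse an environment already known to work for this website.
            env = queue.popleft()
            self.graylist.add(env.agent)
            return env
        banned = self.blacklist.get(website, set())
        for agent in self.available:
            if agent not in banned and agent not in self.graylist:
                self.graylist.add(agent)
                # Combine the agent with the website information.
                return AgentEnvironment(agent, website)
        return None

    def report(self, env, suitable):
        """Release the agent and store the environment if it proved suitable."""
        self.graylist.discard(env.agent)
        if suitable:
            self.reusable.setdefault(env.website, deque()).append(env)
```

A caller would invoke `select` for each access request, send the request through the returned environment, and then call `report` with the outcome of the suitability judgment.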
2. The method of claim 1, wherein the step of determining the available agent queue comprises:
obtaining source agents from a third-party agent and placing the source agents into an agent pool;
determining a response speed level of each source agent in the agent pool;
using the response speed level of each source agent as a storage priority, and storing the source agents into the available agent queue at regular intervals.
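One way to picture the priority-based storage of claim 2 is a heap keyed by response speed level. The sketch below is illustrative only: the latency thresholds in `bounds`, the function names, and the use of measured latency in seconds as the basis for the level are all assumptions, and the periodic scheduling that would rebuild the queue is not shown.

```python
import heapq
import itertools


def speed_level(latency_s, bounds=(0.5, 1.0, 3.0)):
    """Map a measured latency to a level; level 0 is the fastest.

    The thresholds are placeholder values, not taken from the patent.
    """
    for level, bound in enumerate(bounds):
        if latency_s <= bound:
            return level
    return len(bounds)


def build_available_queue(latencies):
    """Build a heap of (level, insertion order, agent) from agent -> latency."""
    counter = itertools.count()  # tie-breaker that keeps insertion order
    heap = [(speed_level(lat), next(counter), agent)
            for agent, lat in latencies.items()]
    heapq.heapify(heap)
    return heap


def pop_fastest(heap):
    """Remove and return the agent with the best (lowest) speed level."""
    return heapq.heappop(heap)[2]
```

With latencies `{"a": 2.0, "b": 0.2, "c": 0.9, "d": 10.0}`, repeated calls to `pop_fastest` return the agents in the order `b`, `c`, `a`, `d`.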
3. The method of claim 1, wherein the step of determining whether the current agent environment is applicable to the target web site based on the response message of the target web site comprises:
if the response status code in the response message is an environment failure status code, determining that the current agent environment is not suitable for the target website;
if the connection between the current agent environment and the target website cannot be established, judging that the current agent environment is not suitable for the target website;
if the current agent environment can connect to the target website, but receiving the response of the target website times out, judging that the current agent environment is not suitable for the target website;
if the response status code in the response message is a server abnormal status code, determining that the current agent environment is not suitable for the target website;
and otherwise, judging that the current agent environment is suitable for the target website.
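The four rejection conditions of claim 3 can be sketched as a pure function over the outcome of one request. The `Outcome` type and the function name are hypothetical. The claim does not say which HTTP codes count as environment failure status codes, so the set below (403, 407, 429) is an assumption chosen for illustration; server abnormal status codes are taken here to be the 5xx range.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed examples of "environment failure" codes; not specified in the claim.
ENV_FAILURE_CODES = {403, 407, 429}


@dataclass
class Outcome:
    connected: bool               # was a connection to the website established?
    timed_out: bool = False       # did receiving the response time out?
    status: Optional[int] = None  # HTTP status code, if a response arrived


def is_suitable(outcome):
    """Judge whether the current agent environment suits the target website."""
    if not outcome.connected:
        return False  # connection could not be established
    if outcome.timed_out:
        return False  # connected, but receiving the response timed out
    if outcome.status in ENV_FAILURE_CODES:
        return False  # environment failure status code
    if outcome.status is not None and 500 <= outcome.status < 600:
        return False  # server abnormal status code (assumed to mean 5xx)
    return True       # otherwise the environment is judged suitable
```

Keeping the judgment separate from the network call makes each condition easy to exercise in isolation.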
4. The method according to claim 1, wherein, if the current agent environment is not suitable for the target website, the step of detecting the state of the current agent environment itself and processing the current agent environment according to the detection result comprises:
accessing a public network server through the current agent environment;
if the current agent environment can access the public network server, not processing;
if the current agent environment cannot establish network connection with a public network server, deleting the current agent environment from an available agent queue and writing the current agent environment into a blacklist;
if the current agent environment successfully establishes a network connection with the public network server, but receiving the response of the server times out, deleting the current agent environment from the available agent queue and writing the current agent environment into a blacklist;
and if the current agent environment establishes a network connection with the public network server, but the response status code is an abnormal status code, deleting the current agent environment from the available agent queue and writing the current agent environment into a blacklist.
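The handling in claim 4 reduces to one rule: any failed probe of the public network server removes the agent from the available agent queue and blacklists it, while a successful probe leaves everything unchanged. A minimal sketch follows, assuming a hypothetical `Probe` enumeration for the probe result and plain Python containers for the queue and the blacklist; how the probe itself is performed is not shown.

```python
from enum import Enum


class Probe(Enum):
    OK = "ok"                            # public network server reachable
    CONNECT_FAILED = "connect_failed"    # no network connection established
    TIMEOUT = "timeout"                  # connected, response timed out
    ABNORMAL_STATUS = "abnormal_status"  # connected, abnormal status code


def handle_probe(agent, result, available, blacklist):
    """Apply the claim 4 handling; return True if the agent was blacklisted."""
    if result is Probe.OK:
        return False  # the agent itself is healthy, so nothing is done
    if agent in available:
        available.remove(agent)  # delete from the available agent queue
    blacklist.add(agent)         # record the agent in the blacklist
    return True
```

For example, `handle_probe("a:1", Probe.TIMEOUT, available, blacklist)` removes `"a:1"` from `available` and adds it to `blacklist`.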
5. An apparatus for determining a reusable agent queue, comprising:
an access request acquisition module capable of acquiring an access request from a client;
an agent environment assignment module capable of determining a current agent environment applicable to the access request, including:
determining a target website to be accessed by the access request, and acquiring a reusable agent environment queue corresponding to the target website;
if the reusable agent environment queue has available agent environments, selecting an agent environment from the reusable agent environment queue as a current agent environment;
if there is no available agent environment in the reusable agent environment queue, selecting an agent which is not in the blacklist or the gray list from the available agent queue as the current agent environment, including:
selecting agents which are not in a blacklist or a gray list from an available agent queue, wherein the blacklist records agents which are not applicable to the target website, and the gray list records agents which are being used;
combining the agent with the website information of the target website, and taking the combined result as a current agent environment corresponding to the target website;
a response processing module capable of sending the access request to the target website based on the current agent environment, and judging whether the current agent environment is suitable for the target website according to the response message of the target website;
a queue maintenance module configured to store the current agent environment into the reusable agent environment queue corresponding to the target website if the current agent environment is suitable for the target website, and, if the current agent environment is not suitable for the target website, to detect the state of the current agent environment itself and process the current agent environment according to the detection result.
6. A terminal comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor executes the computer program to implement the method of any one of claims 1 to 4.
7. A computer readable storage medium storing computer executable instructions for performing the method of any one of claims 1 to 4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111142651.0A CN113905092B (en) | 2021-09-28 | 2021-09-28 | Method, device, terminal and storage medium for determining reusable agent queue |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111142651.0A CN113905092B (en) | 2021-09-28 | 2021-09-28 | Method, device, terminal and storage medium for determining reusable agent queue |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113905092A (en) | 2022-01-07 |
CN113905092B (en) | 2024-03-22 |
Family
ID=79029634
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111142651.0A Active CN113905092B (en) | 2021-09-28 | 2021-09-28 | Method, device, terminal and storage medium for determining reusable agent queue |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113905092B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113923260B (en) * | 2021-09-28 | 2024-01-09 | 盐城天眼察微科技有限公司 | Method, device, terminal and storage medium for processing agent environment |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103294732A (en) * | 2012-03-05 | 2013-09-11 | 富士通株式会社 | Web page crawling method and spider |
CN107957999A (en) * | 2016-10-14 | 2018-04-24 | 北京国双科技有限公司 | A kind of web crawlers obtains the method and device of website data |
CN108345642A (en) * | 2018-01-12 | 2018-07-31 | 深圳壹账通智能科技有限公司 | Method, storage medium and the server of website data are crawled using Agent IP |
CN110851753A (en) * | 2019-11-07 | 2020-02-28 | 亿企赢网络科技有限公司 | Website access method, device, equipment and storage medium |
CN110875899A (en) * | 2018-08-30 | 2020-03-10 | 阿里巴巴集团控股有限公司 | Data processing method, system and network system |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8280031B2 (en) * | 2009-01-08 | 2012-10-02 | Soundbite Communications, Inc. | Method and system for managing interactive communications campaign using a hold queue |
US10609155B2 (en) * | 2015-02-20 | 2020-03-31 | International Business Machines Corporation | Scalable self-healing architecture for client-server operations in transient connectivity conditions |
US10193758B2 (en) * | 2016-04-18 | 2019-01-29 | International Business Machines Corporation | Communication via a connection management message that uses an attribute having information on queue pair objects of a proxy node in a switchless network |
Also Published As
Publication number | Publication date |
---|---|
CN113905092A (en) | 2022-01-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109302498B (en) | Network resource access method and device | |
US8060920B2 (en) | Generating and changing credentials of a service account | |
US11108695B2 (en) | Method, system and device for adjusting load of resource server | |
CN109981702B (en) | File storage method and system | |
US20180278646A1 (en) | Early-Warning Decision Method, Node and Sub-System | |
CN113905092B (en) | Method, device, terminal and storage medium for determining reusable agent queue | |
CN111478792B (en) | Cutover information processing method, system and device | |
US20160196079A1 (en) | Reusing storage blocks of a file system | |
CN115756955A (en) | Data backup and data recovery method and device and computer equipment | |
CN110113187B (en) | Configuration updating method and device, configuration server and configuration system | |
CN113923260B (en) | Method, device, terminal and storage medium for processing agent environment | |
CN114221807B (en) | Access request processing method, device, monitoring equipment and storage medium | |
US11082484B2 (en) | Load balancing system | |
US10397312B2 (en) | Automated server deployment platform | |
CN113300966A (en) | Flow control method, device and system and electronic equipment | |
CN112153036A (en) | Security defense method and system based on proxy server | |
CN110955579A (en) | Ambari-based large data platform monitoring method | |
CN113742664B (en) | Monitoring and auditing method, equipment and system | |
CN112527521B (en) | Message processing method and device | |
CN117093639B (en) | Socket connection processing method and system based on audit service | |
CN115550222B (en) | Method, system, terminal and storage medium for detecting abnormal state of equipment | |
CN109710179B (en) | Storage service quality control method and device | |
CN117811778A (en) | Access request control method and device | |
CN115658760A (en) | Data processing method, device, server and computer readable storage medium | |
CN115357582A (en) | Data merging method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
TA01 | Transfer of patent application right | Effective date of registration: 2023-08-01. Address after: Room 404-405, 504, Building B-17-1, Big data Industrial Park, Kecheng Street, Yannan High tech Zone, Yancheng, Jiangsu Province, 224000. Applicant after: Yancheng Tianyanchawei Technology Co.,Ltd. Address before: 224000 room 501-503, building b-17-1, Xuehai road big data Industrial Park, Kecheng street, Yannan high tech Zone, Yancheng City, Jiangsu Province. Applicant before: Yancheng Jindi Technology Co.,Ltd. |
GR01 | Patent grant | ||