CN112256944A - Automatic website data crawling method based on JMeter - Google Patents

Info

Publication number
CN112256944A
CN112256944A CN202011156240.2A CN202011156240A CN112256944A CN 112256944 A CN112256944 A CN 112256944A CN 202011156240 A CN202011156240 A CN 202011156240A CN 112256944 A CN112256944 A CN 112256944A
Authority
CN
China
Prior art keywords
data
crawling
request
interface
jmeter
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011156240.2A
Other languages
Chinese (zh)
Inventor
杨雪梅
唐军
刘楚雄
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan Changhong Electric Co Ltd
Original Assignee
Sichuan Changhong Electric Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan Changhong Electric Co Ltd
Priority to CN202011156240.2A
Publication of CN112256944A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Transfer Between Computers (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to the field of web technology, and in particular to a JMeter-based method for automatically crawling website data. The method avoids the large number of complex JS operations involved in crawling data from a front-end interface, and also avoids a lengthy crawling process or outright crawling failure caused by the limits some websites place on the number of HTTP requests and the access frequency within a given period. The technical scheme is as follows: a target website whose data is to be crawled is determined; data analysis is performed on the target website to obtain a data interface and the attribute information corresponding to it; the data interface is executed at the JMeter end, and the request parameters and corresponding response results are checked against the expected settings; if they conform, the parameters of the data interface are dynamically configured, the field parameters extracted from the response of the data interface are dynamically configured, and the output target file is dynamically configured; once this dynamic configuration is in place, an anti-crawling mechanism is set and data crawling starts. The method is applicable to the automatic crawling of website data.

Description

Automatic website data crawling method based on JMeter
Technical Field
The invention relates to the field of web technology, and in particular to a JMeter-based method for automatically crawling website data.
Background
A website data crawler is a program that automatically extracts data from website pages. It can capture specific data displayed on a page and store it in a local file or database for use by other projects or for developing particular functions, such as collecting video resources from movie websites, commodity names and prices from shopping websites, or article titles and contents from novel websites. Crawlers are widely used in practical projects and play an important role in many web development projects and in data support.
Existing website page data crawling is based on extracting page elements from the front-end interface. Its advantage is that the data is visual and intuitive, so the data to be crawled can be identified more clearly; its drawbacks are equally obvious.
In the existing website data crawling process, as shown in Fig. 1, a browser that simulates user requests is set up locally, and an HTTP request is sent through the browser to obtain the HTML page required by the service. After loading the HTML page, the browser sends further HTTP requests to load the JS files embedded in the page and renders the page. Once the JS files have loaded, code can be written to simulate the mouse operations of a real user, and the corresponding information is obtained by completing the relevant simulated operations.
However, this scheme has the following obvious problems. First, many websites limit the number of HTTP requests and the access frequency within a given period, so data crawling easily fails and the whole process takes longer. Second, crawling data that requires JS operations on page elements triggers multiple HTTP requests to load the JS files embedded in the page; in particular, when obtaining the content of nested pages whose front-end elements are irregular, page data is loaded multiple times, which consumes a large amount of network resources and makes data crawling more cumbersome.
Disclosure of Invention
The invention aims to provide a JMeter-based method for automatically crawling website data, which avoids the large number of complex JS operations involved in crawling data from a front-end interface, and also avoids a lengthy crawling process or outright crawling failure caused by the limits some websites place on the number of HTTP requests and the access frequency within a given period.
The invention adopts the following technical scheme to achieve this aim. The JMeter-based method for automatically crawling website data comprises the following steps:
step (1), determining a target website whose data is to be crawled;
step (2), performing data analysis on the target website to obtain a data interface and the attribute information corresponding to the data interface;
step (3), executing the data interface at the JMeter end and checking whether the request parameters and the corresponding response results of the data interface conform to the expected settings; if so, proceeding to step (4); otherwise, debugging the data interface at the JMeter end;
step (4), dynamically configuring the parameters of the data interface, dynamically configuring the field parameters extracted from the response of the data interface, and dynamically configuring the output target file;
step (5), after the corresponding dynamic configuration has been set, setting an anti-crawling mechanism;
step (6), crawling data in batches, and outputting and saving the data to the target file.
Further, in step (2), the attribute information corresponding to the data interface includes: request address, request parameters, request type, request header, and request body.
Further, in step (4), dynamically configuring the data interface includes dynamically configuring its parameters in the form of variables.
Further, in step (4), the specific method for extracting the field parameters includes: adding a post-processor after the data interface, and selecting a JSON extractor and/or a regular expression extractor and/or an XPath extractor to extract the parameters.
Further, in step (4), the specific method for dynamically configuring the output target file includes: adding user parameters or custom variables before the request is executed, and configuring the file path and file name accordingly.
Further, in step (5), the specific method for setting the anti-crawling mechanism includes: adding a fixed timer under the request execution directory, where the delay of the fixed timer is random and variable and always between 100 ms and 1 s, so that each interface request waits a random period before running; setting different intervals between request executions simulates the irregular requests of a user at different times and prevents the requests from being blocked by the system.
Further, in step (6), the process of crawling data in batches further includes preventing the same request from being executed repeatedly, and the specific method includes: finding the target data and the target page count to be crawled by analyzing the interface response data; setting a Loop Controller at the level of the page-number request and setting its loop count according to the target page count; and adding a counter with an increment of 1 under the Loop Controller, so that the counter is automatically incremented by one each time the request is executed and execution ends when the output value of the counter equals the target page count.
Further, in step (6), the process of crawling data in batches further includes crawling nested web page data, where the nested page includes a first-level page and a second-level page, and the specific method for crawling the nested web page data includes:
step 601, executing the first-level page interface, and acquiring all commodity identifiers in the current page list through a JSON extractor to obtain a commodity identifier array;
step 602, adding a ForEach logic controller under the directory level of the Loop Controller and the page request, where the input of the logic controller is the commodity identifier array and its output is the specific identifier of each commodity;
step 603, looping over the specific identifier of each commodity and the corresponding commodity detail interface request through the ForEach logic controller, and completing storage of the target data of the nested page through JSON extraction and file output by a post-processor.
Further, in step (6), the specific method for crawling data in batches and outputting and saving the data includes: adding a BeanShell PostProcessor under the request data level, acquiring the parameters through the vars.get method, expanding the commodity identifier array in the BeanShell PostProcessor to obtain the target data, and storing the target data in the target file in sequence.
The invention extracts data from the responses of the back-end interface, which avoids the large number of complex JS operations involved in crawling data from the front-end interface. An anti-crawling mechanism is set in which the interval between crawling requests is user-defined, which avoids a lengthy crawling process or outright failure caused by the limits some websites place on the number of HTTP requests and the access frequency within a given period. The JMeter-based data interface configuration prevents repeated execution of the same request and supports capturing nested web page data, which improves the efficiency of data crawling.
Drawings
Fig. 1 is a schematic flow chart of a method for crawling data through a front-end interface in the prior art.
FIG. 2 is a flow chart of the method for automatically crawling website data based on JMeter according to the present invention.
Detailed Description
The following describes the present scheme in further detail with reference to fig. 2 and a specific embodiment, and the specific work flow of the method for automatically crawling website data based on JMeter according to the present invention is as follows:
Target interface acquisition and debugging. To crawl data from a specific website, the first step is to obtain the interface that produces the data. Taking a shopping website as an example, suppose the user needs to crawl five parameters (commodity title, commodity introduction, commodity address, commodity price and current time) as well as the commodity evaluation content under each commodity detail page, for later data comparison or project data support. The source interface of the data is found through page data analysis and the F12 developer tools, and the related attributes of the interface are obtained: request address, request parameters, request type, request header, and request body. The acquired interface is then executed at the JMeter end to check whether the request parameters and the response results meet expectations; if not, the data interface is debugged at the JMeter end. Here the commodity list page is the first-level page and the commodity detail page is the second-level page; together they form a nested page.
Interface configuration. Once the interface has been obtained, the next consideration is how to acquire and save data in batches so as to achieve automatic crawling. The preliminary operations must be completed first: dynamic configuration of the interface parameters, dynamic configuration of the field parameters extracted from the interface response, and dynamic configuration of the output file.
The purpose of configuring parameter passing is to make execution of the interface more flexible. Crawling website data may involve multiple categories, multiple states and multiple page numbers, so the interface should not be hard-coded; configuring the parameters as variables makes the crawling script more flexible. For example, if the current shopping website is filtered by the classifications women, children and coats, and the first 10 pages sorted by sales volume are wanted, then the category (women, children, coats), the sales-volume ordering, the page number and similar data can be configured as variables at the JMeter end, so that later changes to the category and the like become easier.
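The variable-based configuration described above can be illustrated with a minimal Java sketch. It is not part of the original disclosure: the base address `https://shop.example/api/goods` and the variable names `category`, `sort` and `page` are hypothetical, chosen only to show how changing a variable changes the request without editing the request itself.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

class RequestTemplate {
    // Builds a request URL from a base address and a map of variables.
    // Variables are appended in insertion order and URL-encoded.
    static String buildUrl(String base, Map<String, String> vars) {
        StringBuilder sb = new StringBuilder(base);
        char sep = '?';
        for (Map.Entry<String, String> e : vars.entrySet()) {
            sb.append(sep)
              .append(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8))
              .append('=')
              .append(URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
            sep = '&';
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical variables; only these values change between crawls.
        Map<String, String> vars = new LinkedHashMap<>();
        vars.put("category", "women");
        vars.put("sort", "sales");
        vars.put("page", "1");
        System.out.println(buildUrl("https://shop.example/api/goods", vars));
    }
}
```

In JMeter itself the same effect would come from referencing variables in the request fields rather than from code; the sketch only mirrors the idea.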
How the interface response is extracted varies and must be configured for the actual scenario. For example, the target data of the current first-level page includes the title, introduction, address, price and time. A post-processor is added after the interface, and a JSON extractor, regular expression extractor or XPath extractor is selected to extract the parameters. Here the JSON extractor is used to extract the target parameters: JSON extractors are added according to the number of parameters to be extracted, and the title, introduction, address, price and time of all commodities are extracted in array format.
Take the parameter title as an example. The directory structure of the data is:
data -> [{goodsId1, title1, content1, site1, price1, time1}, {goodsId2, title2, content2, site2, price2, time2}, ...], and the format of the extractor is: $.data.
After extraction in this way, title_1 is the first title, title_2 is the second title, and so on until all commodity titles have been traversed. The commodity introduction, address, price and time are extracted in the same way. The XXX_1 form is JMeter's fixed format for extracted array data and needs no further attention.
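The naming scheme for extracted arrays can be mimicked in a few lines of Java. This is a stand-alone simulation, not JMeter code: a plain `Map` stands in for JMeter's variable store, and the class and method names are hypothetical. It assumes the extractor is configured to return all matches, in which case JMeter stores them as `name_1`, `name_2`, ... together with a `name_matchNr` count.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ExtractedVars {
    // Stores each extracted value under name_1, name_2, ... and the
    // number of matches under name_matchNr, mirroring JMeter's convention.
    static Map<String, String> store(String name, List<String> values) {
        Map<String, String> vars = new HashMap<>();
        for (int i = 0; i < values.size(); i++) {
            vars.put(name + "_" + (i + 1), values.get(i)); // indices start at 1
        }
        vars.put(name + "_matchNr", String.valueOf(values.size()));
        return vars;
    }

    public static void main(String[] args) {
        Map<String, String> vars = store("title", Arrays.asList("coat A", "coat B"));
        System.out.println(vars.get("title_1") + ", " + vars.get("title_2")
                + ", count=" + vars.get("title_matchNr"));
    }
}
```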
The output file and address for the crawled data should be defined before the save operation is performed, so that the directory and file name can be configured more flexibly. Specifically, user parameters or custom variables are added before the request is executed, and the file path and file name are configured there. Such as address: path E: \ \ crawlingtest.
The invention provides a simple anti-crawling mechanism: setting different intervals between request executions simulates the irregular requests of a user at different times and prevents the requests from being blocked by the system. Specifically, a fixed timer is added under the request execution directory, and its delay is random and variable, always between 100 ms and 1 s, so that each interface request waits a random period before running. On the one hand, this prevents a large number of requests from reaching the server at once and putting it under pressure when batch data is crawled; on the other hand, it can get around the time-based access limits set by simple websites, which achieves the anti-crawling effect.
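The random wait can be sketched in plain Java as follows. This is an illustration under assumptions, not the patent's configuration: the class and method names are hypothetical, and the bounds are taken from the 100 ms to 1 s range stated above. Inside JMeter, one plausible way to get the same behavior is to put a function such as `${__Random(100,1000)}` in the timer's delay field, though the source does not say which mechanism it uses.

```java
import java.util.concurrent.ThreadLocalRandom;

class RandomDelay {
    static final long MIN_MS = 100;   // lower bound from the description
    static final long MAX_MS = 1000;  // upper bound from the description

    // Returns a delay drawn uniformly from [MIN_MS, MAX_MS], inclusive.
    static long nextDelayMs() {
        return ThreadLocalRandom.current().nextLong(MIN_MS, MAX_MS + 1);
    }

    // Sleeps for a random delay; call this before each request.
    static void pause() throws InterruptedException {
        Thread.sleep(nextDelayMs());
    }

    public static void main(String[] args) throws InterruptedException {
        for (int i = 0; i < 3; i++) {
            long d = nextDelayMs();
            System.out.println("waiting " + d + " ms before request " + (i + 1));
            Thread.sleep(d);
        }
    }
}
```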
During data crawling, repeated execution of the same request must be prevented. Crawled data is often large in volume and spread across many pages, and crawling more complex page content from the front end requires loading the corresponding JS components, which easily causes the same request to be executed repeatedly when a complex front-end page is involved. The invention therefore captures the target data from the back-end interface response, which avoids the repeated requests caused by the front end loading JS files multiple times. Specifically, the target data and the target page count are found by analyzing the interface response data. For example, if the first 10 pages of a shopping website, sorted by sales, are to be crawled, then totalpage is 10; the total page count can also be obtained directly from the interface so that the data of all pages is crawled. Once totalpage is determined, the loop count can be set directly in the Loop Controller: for 10 pages of data, 10 requests are executed, and each request obtains all target data of the current page. To prevent a request from being executed repeatedly, a counter with a maximum value is added under the Loop Controller and its increment is set to 1, so that the counter is incremented by one at the end of each request. The counter value is output as num and passed to the page-number variable of the request through interface parameter passing, so that each page is requested exactly once until the whole loop ends.
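The loop-plus-counter arrangement amounts to the following logic, shown here as a minimal Java sketch. The names `PageLoop`, `crawl` and `fetchPage` are hypothetical; `fetchPage` stands in for the HTTP request to the list interface, and the counter is assumed to start at 1.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.IntFunction;

class PageLoop {
    // Requests pages 1..totalPage exactly once each and collects the results.
    static <T> List<T> crawl(int totalPage, IntFunction<List<T>> fetchPage) {
        List<T> all = new ArrayList<>();
        int num = 1; // counter value passed to the page-number variable
        for (int loop = 0; loop < totalPage; loop++) {
            all.addAll(fetchPage.apply(num)); // one request per page
            num += 1;                         // increment of 1 after each request
        }
        return all;
    }

    public static void main(String[] args) {
        // Stub fetcher: pretends each page returns a single item.
        List<String> items = crawl(3, page -> Arrays.asList("item-from-page-" + page));
        System.out.println(items);
    }
}
```

Because the page number comes only from the counter, no page can be requested twice within one run.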
During data crawling, nested web pages also need to be configured so that their data can be crawled. A simple nested page may include a first-level page and a second-level page. For example, when coats are searched for on a shopping website, the search result is the first-level page; clicking any result in the list opens a detail page, which is the second-level page.
The format of the request address of the commodity detail page is: protocol type://server address/path/goodsId.
If the commodity evaluations under the commodity details need to be captured, the task is data capture from a nested web page. The first-level page interface is executed, and all commodity identifiers goodsId in the current list are acquired through a JSON extractor; the result is an array. A ForEach logic controller is added under the directory level of the Loop Controller and the specific page-number request. The input of the controller is the goodsId array and its output is id, so that id1 is the identifier of the first commodity and id2 is the identifier of the second commodity. The ForEach logic controller loops over the specific identifier of each commodity and the corresponding commodity detail interface request, and storage of the target data of the nested page is completed through JSON extraction and file output by the post-processor.
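The nested crawl can be sketched in Java as a loop over the identifier array. This is an illustration only: the server `shop.example`, the path `goods` and the names `NestedCrawl`, `detailUrl` and `fetchDetail` are hypothetical, and `fetchDetail` stands in for the detail-interface request plus its JSON extraction.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

class NestedCrawl {
    // Builds the detail-page address as protocol://server/path/goodsId.
    static String detailUrl(String protocol, String server, String path, String goodsId) {
        return protocol + "://" + server + "/" + path + "/" + goodsId;
    }

    // Visits each commodity identifier once, in list order, and collects
    // whatever the detail request returns for it.
    static List<String> crawlDetails(List<String> goodsIds, Function<String, String> fetchDetail) {
        List<String> results = new ArrayList<>();
        for (String id : goodsIds) {
            String url = detailUrl("https", "shop.example", "goods", id);
            results.add(fetchDetail.apply(url));
        }
        return results;
    }

    public static void main(String[] args) {
        // Stub fetcher: echoes the URL instead of performing a request.
        List<String> out = crawlDetails(Arrays.asList("1001", "1002"), url -> "reviews@" + url);
        out.forEach(System.out::println);
    }
}
```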
The Loop Controller executes the requests automatically and the batch run continues until the loop ends, which avoids recording and saving data by hand and improves crawling efficiency. Result storage is implemented in Java code, which can be embedded directly in JMeter, so the implementation is simple. The request loop control and counting described above, together with the JSON extractor added as a post-processor to the interface, complete the extraction of the target data; what remains is to process and save the crawled data. Specifically, a post-processor, the BeanShell PostProcessor, is added under the request, and the parameter variables are obtained through the vars.get method. Because the data extractor stores an array, the array must be expanded in the BeanShell PostProcessor to obtain the target data, which is then stored in sequence in the file path configured earlier.
Output file configuration: File file = new File(vars.get("path"));
Determining the length of the crawled data array: vars.get("name_matchNr");
Looping output to the target file: for (int i = 1; i <= Integer.
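Since the fragments above are incomplete, the following stand-alone Java sketch shows one way the save step could look. It is an assumption-laden reconstruction, not the original script: a plain `Map` replaces JMeter's `vars` object so the code runs outside JMeter, and the class name `ResultWriter`, the method `save` and the output file name are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ResultWriter {
    // Reads name_matchNr to learn how many values were extracted, expands
    // name_1..name_N in order, and appends one value per line to the target.
    static int save(Map<String, String> vars, String name, Path target) throws IOException {
        int count = Integer.parseInt(vars.getOrDefault(name + "_matchNr", "0"));
        List<String> lines = new ArrayList<>();
        for (int i = 1; i <= count; i++) {
            lines.add(vars.get(name + "_" + i));
        }
        Files.write(target, lines, StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        return count;
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> vars = new HashMap<>();
        vars.put("name_1", "coat A");
        vars.put("name_2", "coat B");
        vars.put("name_matchNr", "2");
        // Hypothetical output file in the working directory.
        int written = save(vars, "name", Paths.get("crawl-output.txt"));
        System.out.println("wrote " + written + " lines");
    }
}
```

Appending rather than overwriting lets each loop iteration add its page of results to the same file.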
The method is particularly suitable for crawling data from a single page or a nested page of a website, and its core idea also applies to crawling any data content returned by interface responses, such as those of apps and WeChat mini programs.
In conclusion, the method avoids the large number of complex JS operations involved in crawling data from a front-end interface, avoids a lengthy crawling process or outright failure caused by the limits some websites place on the number of HTTP requests and the access frequency within a given period, and improves the efficiency of data crawling.

Claims (9)

1. A JMeter-based method for automatically crawling website data, characterized by comprising the following steps:
step (1), determining a target website whose data is to be crawled;
step (2), performing data analysis on the target website to obtain a data interface and the attribute information corresponding to the data interface;
step (3), executing the data interface at the JMeter end and checking whether the request parameters and the corresponding response results of the data interface conform to the expected settings; if so, proceeding to step (4); otherwise, debugging the data interface at the JMeter end;
step (4), dynamically configuring the parameters of the data interface, dynamically configuring the field parameters extracted from the response of the data interface, and dynamically configuring the output target file;
step (5), after the corresponding dynamic configuration has been set, setting an anti-crawling mechanism;
step (6), crawling data in batches, and outputting and saving the data to the target file.
2. The JMeter-based method for automatically crawling website data as claimed in claim 1, wherein in step (2), the attribute information corresponding to the data interface comprises: request address, request parameters, request type, request header, and request body.
3. The JMeter-based method for automatically crawling website data as claimed in claim 1, wherein in step (4), dynamically configuring the data interface comprises dynamically configuring its parameters in the form of variables.
4. The JMeter-based method for automatically crawling website data as claimed in claim 1, wherein in step (4), the specific method for extracting the field parameters comprises: adding a post-processor after the data interface, and selecting a JSON extractor and/or a regular expression extractor and/or an XPath extractor to extract the parameters.
5. The JMeter-based method for automatically crawling website data as claimed in claim 1, wherein in step (4), the specific method for dynamically configuring the output target file comprises: adding user parameters or custom variables before the request is executed, and configuring the file path and file name accordingly.
6. The JMeter-based method for automatically crawling website data as claimed in claim 1, wherein in step (5), the specific method for setting the anti-crawling mechanism comprises: adding a fixed timer under the request execution directory, where the delay of the fixed timer is random and variable and always between 100 ms and 1 s, so that each interface request waits a random period before running; setting different intervals between request executions simulates the irregular requests of a user at different times and prevents the requests from being blocked by the system.
7. The JMeter-based method for automatically crawling website data as claimed in claim 4, wherein in step (6), the process of crawling data in batches further comprises preventing the same request from being executed repeatedly, and the specific method comprises: finding the target data and the target page count to be crawled by analyzing the interface response data; setting a Loop Controller at the level of the page-number request and setting its loop count according to the target page count; and adding a counter with an increment of 1 under the Loop Controller, so that the counter is automatically incremented by one each time the request is executed and execution ends when the output value of the counter equals the target page count.
8. The JMeter-based method for automatically crawling website data as claimed in claim 7, wherein in step (6), the process of crawling data in batches further comprises crawling nested web page data, the nested page comprises a first-level page and a second-level page, and the specific method for crawling the nested web page data comprises:
step 601, executing the first-level page interface, and acquiring all commodity identifiers in the current page list through a JSON extractor to obtain a commodity identifier array;
step 602, adding a ForEach logic controller under the directory level of the Loop Controller and the page request, where the input of the logic controller is the commodity identifier array and its output is the specific identifier of each commodity;
step 603, looping over the specific identifier of each commodity and the corresponding commodity detail interface request through the ForEach logic controller, and completing storage of the target data of the nested page through JSON extraction and file output by a post-processor.
9. The JMeter-based method for automatically crawling website data as claimed in claim 8, wherein in step (6), the specific method for crawling data in batches and outputting and saving the data comprises: adding a BeanShell PostProcessor under the request data level, acquiring the parameters through the vars.get method, expanding the commodity identifier array in the BeanShell PostProcessor to obtain the target data, and storing the target data in the target file in sequence.
CN202011156240.2A 2020-10-26 2020-10-26 Automatic website data crawling method based on JMeter Pending CN112256944A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011156240.2A CN112256944A (en) 2020-10-26 2020-10-26 Automatic website data crawling method based on JMeter

Publications (1)

Publication Number Publication Date
CN112256944A true CN112256944A (en) 2021-01-22

Family

ID=74262037

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011156240.2A Pending CN112256944A (en) 2020-10-26 2020-10-26 Automatic website data crawling method based on JMeter

Country Status (1)

Country Link
CN (1) CN112256944A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113254749A (en) * 2021-06-10 2021-08-13 山东浪潮通软信息科技有限公司 Data crawling method and device based on http protocol
CN113836450A (en) * 2021-11-30 2021-12-24 垒知科技集团四川有限公司 Data interface generation method for acquiring XPATH based on visual operation
TWI781839B (en) * 2021-12-02 2022-10-21 中華電信股份有限公司 Electronic device and method for inspecting product checkout loophole of website

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109408379A (en) * 2018-09-30 2019-03-01 福建星瑞格软件有限公司 One kind is based on promotion jmeter interface automatic test data matching method
CN110297774A (en) * 2019-07-02 2019-10-01 四川长虹电器股份有限公司 A kind of automatic interface testing method based on python
CN110597721A (en) * 2019-09-11 2019-12-20 四川长虹电器股份有限公司 Automatic interface pressure testing method based on pressure testing script
CN111737137A (en) * 2020-06-24 2020-10-02 重庆紫光华山智安科技有限公司 Interface test data generation method and device, host and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
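王大力测试进阶之路 (Wang Dali's Road to Advanced Testing): "Crawling Lianjia listing information and saving it to CSV with JMeter + ForEach Controller + BeanShell PostProcessor", 《HTTPS://CLOUD.TENCENT.COM/DEVELOPER/ARTICLE/1527465》 *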

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113254749A (en) * 2021-06-10 2021-08-13 山东浪潮通软信息科技有限公司 Data crawling method and device based on http protocol
CN113254749B (en) * 2021-06-10 2022-08-23 浪潮通用软件有限公司 Data crawling method and device based on http protocol
CN113836450A (en) * 2021-11-30 2021-12-24 垒知科技集团四川有限公司 Data interface generation method for acquiring XPATH based on visual operation
TWI781839B (en) * 2021-12-02 2022-10-21 中華電信股份有限公司 Electronic device and method for inspecting product checkout loophole of website

Similar Documents

Publication Publication Date Title
CN110297759B (en) Method, device, equipment and storage medium for manufacturing test page script
CN112256944A (en) Automatic website data crawling method based on JMeter
CN107729475B (en) Webpage element acquisition method, device, terminal and computer-readable storage medium
CN101488151B (en) System and method for gathering website contents
CN112099768A (en) Business process processing method and device and computer readable storage medium
CN108228873A (en) Object recommendation, publication content delivery method, device, storage medium and equipment
CN109376291B (en) Website fingerprint information scanning method and device based on web crawler
CN108304410A (en) A kind of detection method, device and the data analysing method of the abnormal access page
CN107644100B (en) Information processing method, device and system and computer readable storage medium
CN105138312A (en) Table generation method and apparatus
CN109308251B (en) Test data verification method and device
CN107276842B (en) Interface test method and device and electronic equipment
CN110507986B (en) Animation information processing method and device
CN111090797B (en) Data acquisition method, device, computer equipment and storage medium
CN110795085A (en) Mobile application visual editing method and tool
CN106776318A (en) A kind of test script method for recording and system
CN110851756A (en) Page loading method and device, computer readable storage medium and terminal equipment
CN103838862A (en) Video searching method, device and terminal
CN113296653A (en) Simulation interaction model construction method, interaction method and related equipment
CN110399063B (en) Method and device for viewing page element attributes
CN116069577A (en) Interface testing method, equipment and medium for RPC service
CN111881043B (en) Page testing method and device, storage medium and processor
CN111026947B (en) Crawler method and embedded crawler implementation method based on browser
CN112667934A (en) Dynamic simulation diagram display method and device, electronic equipment and computer readable medium
CN114942878A (en) Automatic performance testing method for internet application and related equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210122