CN114611039B - Analysis method and device of asynchronous loading rule, storage medium and electronic equipment - Google Patents

Analysis method and device of asynchronous loading rule, storage medium and electronic equipment

Info

Publication number
CN114611039B
Authority
CN
China
Prior art keywords
target interface
asynchronous loading
data
response data
node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210177370.7A
Other languages
Chinese (zh)
Other versions
CN114611039A (en
Inventor
薛秋雨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yancheng Tianyanchawei Technology Co ltd
Original Assignee
Yancheng Tianyanchawei Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yancheng Tianyanchawei Technology Co ltd
Priority to CN202210177370.7A
Publication of CN114611039A
Application granted
Publication of CN114611039B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/957 Browsing optimisation, e.g. caching or content distillation

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The embodiments of the present disclosure provide a method and an apparatus for parsing an asynchronous loading rule, a storage medium and an electronic device. The method includes: sending a webpage data asynchronous loading request; obtaining response data returned by the server for the webpage data asynchronous loading request, and rendering a target interface in the webpage based on the response data to generate a new target interface; extracting a corresponding parsing field from the newly generated target interface; and parsing the source code of the response data based on the extracted parsing field to obtain the asynchronous loading rule of the newly generated target interface.

Description

Analysis method and device of asynchronous loading rule, storage medium and electronic equipment
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a method and apparatus for parsing an asynchronous loading rule, a storage medium, an electronic device, and a computer program product.
Background
In practical business scenarios, obtaining asynchronously loaded data requires that the data crawling rule of each webpage be determined manually. In large-scale crawling, the asynchronously loaded data sources for each dimension may come from tens of thousands of websites; determining the data crawling rule of each webpage manually then prevents the asynchronously loaded data from being obtained efficiently.
Disclosure of Invention
In view of this, the data loading rule of existing asynchronously loaded data (the asynchronous data loading rule being the data crawling rule of the crawler) has to be parsed manually, and that manual parsing is inefficient. It is therefore necessary to provide a parsing method, an apparatus, a storage medium, an electronic device and a computer program product that address this problem.
In a first aspect, an embodiment of the present disclosure provides a method for parsing an asynchronous loading rule, where the method includes:
sending a webpage data asynchronous loading request;
obtaining response data returned by the server for the webpage data asynchronous loading request, and rendering a target interface in the webpage based on the response data to generate a new target interface;
extracting a corresponding parsing field from the newly generated target interface; and
parsing the source code of the response data based on the extracted parsing field to obtain the asynchronous loading rule of the newly generated target interface.
In one embodiment, sending a webpage data asynchronous loading request includes:
acquiring a webpage data asynchronous loading request that was sent by a client to the server and intercepted by an interceptor; and
simulating the sending of the intercepted webpage data asynchronous loading request, so as to initiate the webpage data asynchronous loading request to the server.
In one embodiment, the new target interface is determined by operations comprising:
before rendering the target interface in the page based on the response data to generate the new target interface, acquiring the webpage source code of the page;
after rendering the target interface in the page based on the response data to generate the new target interface, acquiring the webpage source code of the page again; and
determining the new target interface by comparing the two acquired versions of the webpage source code.
In one embodiment, extracting the corresponding parsing field from the newly generated target interface includes:
dividing the newly generated target interface into at least one row block;
extracting a corresponding data fingerprint from any one of the at least one row block; and
breaking the data fingerprint down into at least one parsing field.
In one embodiment, extracting the corresponding parsing field from the newly generated target interface includes:
dividing the newly generated target interface into at least one row block;
for each of the at least one row block, extracting a corresponding data fingerprint and breaking each data fingerprint down into at least one corresponding parsing field.
In one embodiment, parsing the source code of the response data based on the extracted parsing field to obtain the asynchronous loading rule of the newly generated target interface includes:
searching the source code of the response data in a preset recursive manner to determine the leaf node of each of the at least one parsing field, thereby obtaining at least one corresponding leaf node;
continuing to search the source code of the response data to determine a common parent node of the at least one leaf node;
determining the data group corresponding to the current row block through the common parent node;
determining the asynchronous loading rule of the current row block based on the data group corresponding to the current row block; and
deriving the asynchronous loading rule of the newly generated target interface from the asynchronous loading rule of the current row block.
In one embodiment, continuing to search the source code of the response data to determine a common parent node of the at least one leaf node includes:
continuing to search the source code of the response data to determine the parent node of each of the at least one leaf node; and
in the case that the parent nodes of the leaf nodes are the same node, taking that node as the common parent node of the at least one leaf node.
In one embodiment, the method further comprises:
in the case that the parent nodes of the leaf nodes are different nodes, determining, among those parent nodes, the parent node with the shortest path to the root node as the common parent node of the at least one leaf node.
In one embodiment, the following is performed in the case where the response data is JSON data:
extracting a corresponding parsing field from the newly generated target interface; and
parsing the source code of the response data based on the extracted parsing field to obtain the asynchronous loading rule of the newly generated target interface.
In one embodiment, parsing the source code of the response data based on the extracted parsing field to obtain the asynchronous loading rule of the newly generated target interface includes:
parsing the source code of the response data based on at least one parsing field extracted from any one row block of the at least one row block to obtain the asynchronous loading rule of that row block;
parsing the source code of the response data based on at least one parsing field extracted from another row block of the at least one row block to obtain the asynchronous loading rule of that other row block; and
in the case that the two asynchronous loading rules are consistent, taking the asynchronous loading rule of either row block as the asynchronous loading rule of the newly generated target interface.
In one embodiment, sending a webpage data asynchronous loading request includes:
automatically triggering a corresponding page-turning control, based on an automatic page-turning rule, to send the webpage data asynchronous loading request.
In a second aspect, an embodiment of the present disclosure provides an apparatus for parsing an asynchronous loading rule, where the apparatus includes:
a sending unit, configured to send a webpage data asynchronous loading request;
an acquiring and rendering unit, configured to acquire response data returned by the server for the webpage data asynchronous loading request, and to render a target interface in a page based on the response data so as to generate a new target interface;
an extraction unit, configured to extract a corresponding parsing field from the newly generated target interface; and
a parsing unit, configured to parse the source code of the response data based on the extracted parsing field so as to obtain the asynchronous loading rule of the newly generated target interface.
In a third aspect, an embodiment of the present disclosure provides an electronic device, including:
a processor; and
a memory for storing instructions executable by the processor;
wherein the processor is configured to read the executable instructions from the memory and execute them to implement the method steps described above.
In a fourth aspect, an embodiment of the present disclosure provides a computer-readable storage medium storing a computer program for performing the method steps described above.
In a fifth aspect, an embodiment of the present disclosure provides a computer program product comprising a computer program which, when executed by a processor, implements the method steps described above.
In the embodiments of the present disclosure, a webpage data asynchronous loading request is sent; response data returned by the server for the request is obtained, and a target interface in the webpage is rendered based on the response data to generate a new target interface; a corresponding parsing field is extracted from the newly generated target interface; and the source code of the response data is parsed based on the extracted parsing field to obtain the asynchronous loading rule of the newly generated target interface. The parsing method provided by the embodiments of the present disclosure can thus parse the source code of the response data automatically. Because the whole parsing process requires no manual participation, the efficiency of parsing the asynchronous loading rule is greatly improved; and because the whole process is automatic, possible manual parsing errors are avoided, which effectively improves the accuracy of the parsed asynchronous loading rule.
Drawings
Exemplary embodiments of the present invention may be more fully understood by reference to the following drawings. The accompanying drawings are included to provide a further understanding of the embodiments of the disclosure and constitute a part of this specification; together with the embodiments of the disclosure they serve to illustrate the invention and do not limit it. In the drawings, like reference numerals generally refer to like parts or steps.
FIG. 1 is a flowchart of a method for parsing an asynchronous loading rule according to an exemplary embodiment of the present application;
FIG. 2 is a schematic diagram of extracting corresponding parsing fields from a target interface with the parsing method in a specific application scenario according to an exemplary embodiment of the present application;
FIG. 3 is a code schematic corresponding to the process of extracting parsing fields in a specific application scenario according to an exemplary embodiment of the present application;
FIG. 4 is another code schematic corresponding to the process of extracting parsing fields in a specific application scenario according to an exemplary embodiment of the present application;
FIG. 5 is a schematic structural diagram of an apparatus 500 for parsing an asynchronous loading rule according to an exemplary embodiment of the present application;
FIG. 6 is a schematic diagram of an electronic device according to an exemplary embodiment of the present application;
FIG. 7 is a schematic diagram of a computer-readable medium according to an exemplary embodiment of the present application.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
It is noted that unless otherwise indicated, technical or scientific terms used herein should be given the ordinary meaning as understood by one of ordinary skill in the art to which this application belongs.
In addition, the terms "first" and "second" etc. are used to distinguish different objects and are not used to describe a particular order. Furthermore, the terms "comprise" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those listed steps or elements but may include other steps or elements not listed or inherent to such process, method, article, or apparatus.
Referring to fig. 1, which is a flowchart of a method for parsing an asynchronous loading rule according to some embodiments of the present application, the method may include the following steps:
Step S101: sending a webpage data asynchronous loading request.
In one possible implementation, sending the webpage data asynchronous loading request includes the following steps:
acquiring a webpage data asynchronous loading request that was sent by a client to the server and intercepted by an interceptor; and
simulating the sending of the intercepted webpage data asynchronous loading request, so as to initiate the webpage data asynchronous loading request to the server.
It should be understood that, in this embodiment, the interceptor may be deployed on the client. It intercepts the webpage data asynchronous loading request that the client sends to the server, forwards the request to the server, and saves a copy of the intercepted request locally. It likewise intercepts the response that the server returns for the request, forwards the response to the client, and stores the response locally in association with the request.
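The interceptor behaviour described above can be illustrated with a minimal in-memory sketch. The names `Request`, `Interceptor`, `send_to_server` and `fake_server`, as well as the example URL, are hypothetical and not part of the disclosure; a real interceptor would hook the browser's network layer rather than call a Python function.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Request:
    method: str
    url: str
    body: str = ""


@dataclass
class Interceptor:
    """Forwards requests and keeps each request/response pair locally."""
    send_to_server: Callable[[Request], str]  # hypothetical transport
    log: List[Tuple[Request, str]] = field(default_factory=list)

    def handle(self, request: Request) -> str:
        response = self.send_to_server(request)  # forward the request to the server
        self.log.append((request, response))     # store the response with its request
        return response                          # hand the response back to the client

    def replay(self, index: int = -1) -> str:
        """Simulate sending a previously intercepted request again."""
        request, _ = self.log[index]
        return self.send_to_server(request)


def fake_server(request: Request) -> str:
    # Stand-in for a real server: always returns the same JSON text.
    return '{"data": {"list": []}}'


interceptor = Interceptor(send_to_server=fake_server)
first = interceptor.handle(Request("GET", "https://example.com/api/news?page=2"))
replayed = interceptor.replay()
```

Keeping the request and response together in one log entry is what later allows the response source code to be parsed against the interface that the same request produced.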
Step S102: obtaining response data returned by the server for the webpage data asynchronous loading request, and rendering the target interface in the webpage based on the response data to generate a new target interface.
In one possible implementation, the new target interface is determined by the following steps:
before rendering the target interface in the page based on the response data to generate the new target interface, acquiring the webpage source code of the page;
after rendering the target interface in the page based on the response data to generate the new target interface, acquiring the webpage source code of the page again; and
determining the new target interface by comparing the two acquired versions of the webpage source code.
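As a rough illustration of the before/after comparison, the sketch below diffs two snapshots of page source line by line with Python's standard `difflib` module and keeps the lines that appear only in the second snapshot. The function name `new_fragments` and the sample markup are assumptions made for this example; the disclosure does not specify how the comparison is carried out.

```python
import difflib
from typing import List


def new_fragments(before: str, after: str) -> List[str]:
    """Return the source lines present in `after` but not in `before`."""
    diff = difflib.ndiff(before.splitlines(), after.splitlines())
    # ndiff marks lines unique to the second sequence with the prefix "+ ".
    return [line[2:] for line in diff if line.startswith("+ ")]


before_html = """<ul id="news">
<li>region A | title A | 2021-12-15</li>
</ul>"""

after_html = """<ul id="news">
<li>region A | title A | 2021-12-15</li>
<li>region A | title B | 2021-12-15</li>
</ul>"""

added = new_fragments(before_html, after_html)
```

Here `added` holds the single `<li>` line that the asynchronous response contributed, which is the part of the page treated as the new target interface in this sketch.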
Step S103: extracting a corresponding parsing field from the newly generated target interface.
In one possible implementation, extracting the corresponding parsing field from the newly generated target interface includes the following steps:
dividing the newly generated target interface into at least one row block;
extracting a corresponding data fingerprint from any one of the at least one row block; and
breaking the data fingerprint down into at least one parsing field.
Fig. 2 is a schematic diagram illustrating extraction of a corresponding parsing field from a target interface according to a parsing method in a specific application scenario according to an exemplary embodiment of the present application.
As shown in fig. 2, the newly generated target interface is divided into a plurality of row blocks.
Specifically, corresponding data fingerprints are extracted in turn from the three blocks within the first row block, and the data fingerprints are broken down in turn into the following parsing fields: "region A", "title A" and "2021-12-15".
Likewise, corresponding data fingerprints are extracted in turn from the three blocks within the second row block, and the data fingerprints are broken down in turn into the following parsing fields: "region A", "title B" and "2021-12-15".
In the parsing method provided by the embodiments of the present disclosure, the features of a data fingerprint that carry a specific identification include at least one of the following: a location feature, a title feature, a publication time feature, an author feature, and a jump address link feature.
In addition to these common features, other specifically identified features of the data fingerprint may be introduced according to the requirements of different application scenarios; they are not described in detail here.
Through the above steps, since a data fingerprint is extracted from only one of the row blocks, parsing efficiency can be effectively improved when dividing the newly generated target interface yields a large number of row blocks and breaking down the corresponding data fingerprints yields a large number of parsing fields.
In one possible implementation, extracting the corresponding parsing field from the newly generated target interface includes the steps of:
dividing the newly generated target interface into at least one row block; and
for each of the at least one row block, extracting a corresponding data fingerprint and breaking each data fingerprint down into at least one corresponding parsing field.
Through the above steps, parsing accuracy can be effectively improved when dividing the newly generated target interface yields only a limited number of row blocks and breaking down the corresponding data fingerprints yields few parsing fields.
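The two extraction variants (one row block versus every row block) can be sketched as follows. This is only an illustration under stated assumptions: each `<li>` element is treated as one row block, the data fingerprint is taken to be the ordered tuple of visible text fragments in that row, and the names `RowBlockParser`, `data_fingerprint` and `parsing_fields` are hypothetical. The disclosure does not define the fingerprint format.

```python
from html.parser import HTMLParser
from typing import List, Tuple


class RowBlockParser(HTMLParser):
    """Collects the text fragments of each <li> element as one row block."""

    def __init__(self) -> None:
        super().__init__()
        self.rows: List[List[str]] = []
        self._in_row = False

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self._in_row = True
            self.rows.append([])

    def handle_endtag(self, tag):
        if tag == "li":
            self._in_row = False

    def handle_data(self, data):
        text = data.strip()
        if self._in_row and text:
            self.rows[-1].append(text)


def data_fingerprint(row: List[str]) -> Tuple[str, ...]:
    # Assumption: the fingerprint is the ordered tuple of visible texts.
    return tuple(row)


def parsing_fields(html: str, all_rows: bool = False) -> List[List[str]]:
    """Break fingerprints into parsing fields for one row or for every row."""
    parser = RowBlockParser()
    parser.feed(html)
    rows = parser.rows if all_rows else parser.rows[:1]
    return [list(data_fingerprint(row)) for row in rows]


interface_html = """
<ul>
  <li><span>region A</span><a href="/n/1">title A</a><time>2021-12-15</time></li>
  <li><span>region A</span><a href="/n/2">title B</a><time>2021-12-15</time></li>
</ul>
"""
one_row = parsing_fields(interface_html)
every_row = parsing_fields(interface_html, all_rows=True)
```

With `all_rows=False` only the first row block is processed, which mirrors the efficiency-oriented variant; `all_rows=True` mirrors the accuracy-oriented variant that processes each row block.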
Step S104: parsing the source code of the response data based on the extracted parsing field to obtain the asynchronous loading rule of the newly generated target interface.
Fig. 3 and fig. 4 are code schematics corresponding to the process of extracting parsing fields in a specific application scenario according to an exemplary embodiment of the present application.
Based on the parsing fields "region A", "title A" and "2021-12-15" and the parsing fields "region A", "title B" and "2021-12-15" extracted as shown in fig. 2, the source code shown in fig. 3 and fig. 4 can be obtained, and the source code of the response data is parsed to obtain the asynchronous loading rule of the newly generated target interface.
In one possible implementation, parsing the source code of the response data based on the extracted parsing field to obtain the asynchronous loading rule of the newly generated target interface includes the following steps:
searching the source code of the response data in a preset recursive manner to determine the leaf node of each of the at least one parsing field, thereby obtaining at least one corresponding leaf node;
continuing to search the source code of the response data to determine a common parent node of the at least one leaf node;
determining the data group corresponding to the current row block through the common parent node;
determining the asynchronous loading rule of the current row block based on the data group corresponding to the current row block; and
deriving the asynchronous loading rule of the newly generated target interface from the asynchronous loading rule of the current row block.
In a practical application scenario, the source code of the response data is searched in a preset recursive manner. Specifically, when the response data is JSON data, the search recurses upward from the leaf nodes, level by level, until it reaches the root node; the source code of the response data containing at least one item of target content is searched to find at least one key node containing that target content. When multiple key nodes are found, all of them are recorded and a corresponding key node list is generated automatically from the recorded nodes, so that the details of every key node related to the target content can be looked up directly in the generated list.
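A minimal sketch of locating the leaf nodes for each parsing field in JSON response data is given below. It walks the JSON tree top-down and records the full path from the root to every matching leaf, which is one possible reading of the "preset recursive manner" and not necessarily the traversal the disclosure intends. The function name `find_leaf_paths` and the sample response (keys `area`, `title`, `date`) are assumptions made for this example.

```python
import json
from typing import Any, Iterator, Tuple, Union

Path = Tuple[Union[str, int], ...]


def find_leaf_paths(node: Any, target: str, path: Path = ()) -> Iterator[Path]:
    """Yield the root-to-leaf path of every leaf whose text equals `target`."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from find_leaf_paths(value, target, path + (key,))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from find_leaf_paths(value, target, path + (index,))
    elif str(node) == target:
        yield path


response_text = """
{"code": 0, "data": {"list": [
  {"area": "region A", "title": "title A", "date": "2021-12-15"},
  {"area": "region A", "title": "title B", "date": "2021-12-15"}
]}}
"""
response = json.loads(response_text)

# Key node list: every recorded path for each parsing field.
key_nodes = {
    name: list(find_leaf_paths(response, name))
    for name in ("region A", "title A", "2021-12-15")
}
```

In this sample, "title A" matches a single leaf while "region A" and "2021-12-15" each match two, which is why all key nodes are recorded before a common parent node is chosen.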
In one possible implementation, continuing to search the source code of the response data to determine the common parent node of the at least one leaf node includes the following steps:
continuing to search the source code of the response data to determine the parent node of each of the at least one leaf node; and
in the case that the parent nodes of the leaf nodes are the same node, taking that node as the common parent node of the at least one leaf node.
In one possible implementation, the method for parsing an asynchronous loading rule provided by the embodiments of the present disclosure further includes the following step:
in the case that the parent nodes of the leaf nodes are different nodes, determining, among those parent nodes, the parent node with the shortest path to the root node as the common parent node of the at least one leaf node.
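The two cases above (identical parents, or differing parents resolved by the shortest path to the root) can be sketched directly on leaf paths. The representation of a node as a tuple of keys and list indexes, and the function name `common_parent`, are assumptions made for this illustration.

```python
from typing import List, Tuple, Union

Path = Tuple[Union[str, int], ...]


def common_parent(leaf_paths: List[Path]) -> Path:
    """Pick the common parent node for the given leaf nodes.

    The parent of a leaf is its path without the final step.
    """
    parents = [path[:-1] for path in leaf_paths]
    if len(set(parents)) == 1:
        # All leaves share the same parent node.
        return parents[0]
    # Parents differ: take the parent with the shortest path to the root.
    return min(parents, key=len)


same_parent = common_parent([
    ("data", "list", 0, "area"),
    ("data", "list", 0, "title"),
    ("data", "list", 0, "date"),
])

nested_field = common_parent([
    ("data", "list", 0, "title"),
    ("data", "list", 0, "meta", "date"),
])
```

Both calls return `("data", "list", 0)`. Note that the shortest-path choice only yields a true ancestor of every leaf when the shallowest parent lies on the path of the deeper ones, as it does in this sample; the sketch does not check for that.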
In one possible implementation, the following is performed in the case where the response data is JSON data:
extracting a corresponding parsing field from the newly generated target interface; and
parsing the source code of the response data based on the extracted parsing field to obtain the asynchronous loading rule of the newly generated target interface.
In one possible implementation, parsing the source code of the response data based on the extracted parsing field to obtain an asynchronous loading rule of the newly generated target interface includes:
parsing the source code of the response data based on at least one parsing field extracted from any one row block of the at least one row block to obtain the asynchronous loading rule of that row block;
parsing the source code of the response data based on at least one parsing field extracted from another row block of the at least one row block to obtain the asynchronous loading rule of that other row block; and
in the case that the two asynchronous loading rules are consistent, taking the asynchronous loading rule of either row block as the asynchronous loading rule of the newly generated target interface.
Only when the comparison shows that the two asynchronous loading rules are consistent is the rule finally determined to be the asynchronous loading rule of the new target interface. The automatically generated asynchronous loading rule is thereby calibrated automatically, which can greatly improve the accuracy of the asynchronous loading rule obtained by parsing.
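The cross-check between two row blocks might look like the sketch below. The rule format is an assumption made for this example: list indexes in the common parent path are replaced by the wildcard `"*"`, and each field is addressed relative to the common parent node. The names `derive_rule`, `calibrated_rule` and `row_leaves` are likewise hypothetical; the disclosure does not define how a rule is represented.

```python
from typing import List, Optional, Tuple, Union

Path = Tuple[Union[str, int], ...]


def derive_rule(parent: Path, leaf_paths: List[Path]) -> dict:
    """Generalise one row block's data group into a rule (assumed format)."""
    # Replace concrete list indexes with a wildcard so the rule covers all rows.
    group = tuple("*" if isinstance(step, int) else step for step in parent)
    # Address each field relative to the common parent node.
    fields = [path[len(parent):] for path in leaf_paths]
    return {"group": group, "fields": fields}


def calibrated_rule(rule_a: dict, rule_b: dict) -> Optional[dict]:
    """Return the rule only if both row blocks produced the same one."""
    return rule_a if rule_a == rule_b else None


def row_leaves(index: int) -> List[Path]:
    # Hypothetical leaf paths for the row block at the given list index.
    return [("data", "list", index, key) for key in ("area", "title", "date")]


rule_1 = derive_rule(("data", "list", 0), row_leaves(0))
rule_2 = derive_rule(("data", "list", 1), row_leaves(1))
interface_rule = calibrated_rule(rule_1, rule_2)
```

Because both row blocks generalise to the same group and the same relative fields, `interface_rule` is accepted; had the two rules differed, `calibrated_rule` would return `None` and no rule would be adopted in this sketch.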
In one possible implementation, sending the webpage data asynchronous loading request includes the following step:
automatically triggering a corresponding page-turning control, based on an automatic page-turning rule, to send the webpage data asynchronous loading request. Triggering the page-turning control automatically in this way makes the parsing process more intelligent and more efficient.
It should be appreciated that, in other embodiments, a corresponding switching control may also be triggered automatically based on card switching logic to send the webpage data asynchronous loading request. Alternatively, in other embodiments, corresponding filter and selection controls may be triggered automatically based on drop-down box filtering logic to send the webpage data asynchronous loading request. Alternatively, in other embodiments, the webpage data asynchronous loading request may be triggered and sent automatically based on a search performed by submitting search terms.
In one possible implementation, parsing the source code of the response data based on the extracted parsing field to obtain the asynchronous loading rule of the newly generated target interface includes the following steps:
in response to a touch operation by a target user on any menu bar of the target interface, locating the corresponding row blocks of the target interface according to a preset matching correspondence; and
parsing the source code of the response data based on at least one parsing field extracted from any one of the corresponding row blocks to obtain the asynchronous loading rule of that row block.
Through the above steps, more types of data can be parsed automatically, for example data that includes a Category Code field. Data pages of different types can be brought up in response to the target user's touch operation, the source code of the response data containing all the key parameters of the data page is traced back, and that source code is parsed to obtain the asynchronous loading rule of any one row block.
In a practical application scenario, a correspondence may be preconfigured between any menu bar displayed on the left side of the target interface and the row blocks displayed on the right side of the target interface, each of which includes the area related to a news item and the corresponding news title. The row blocks including news titles on the right side of the target interface can then be parsed automatically simply in response to the target user's touch operation on any menu bar of the target interface.
In the embodiments of the present disclosure, a webpage data asynchronous loading request is sent; response data returned by the server for the request is obtained, and a target interface in the webpage is rendered based on the response data to generate a new target interface; a corresponding parsing field is extracted from the newly generated target interface; and the source code of the response data is parsed based on the extracted parsing field to obtain the asynchronous loading rule of the newly generated target interface. The parsing method provided by the embodiments of the present disclosure can thus parse the source code of the response data automatically. Because the whole parsing process requires no manual participation, the efficiency of parsing the asynchronous loading rule is greatly improved; and because the whole process is automatic, possible manual parsing errors are avoided, which effectively improves the accuracy of the parsed asynchronous loading rule.
The above embodiments provide a method for parsing an asynchronous loading rule; correspondingly, the present application further provides an apparatus for parsing an asynchronous loading rule. The apparatus provided by the embodiments of the present disclosure can implement the above method, and may be implemented in software, hardware, or a combination of the two. For example, the apparatus may comprise integrated or separate functional modules or units for performing the corresponding steps of the method described above.
Referring to fig. 5, a schematic diagram of an apparatus for parsing an asynchronous loading rule according to some embodiments of the present application is shown. Since the apparatus embodiments are substantially similar to the method embodiments, they are described relatively briefly; for relevant details, refer to the description of the method embodiments. The apparatus embodiments described below are merely illustrative.
As shown in fig. 5, the apparatus 500 for parsing an asynchronous loading rule may include:
a sending unit 501, configured to send a webpage data asynchronous loading request;
an acquiring and rendering unit 502, configured to acquire response data returned by the server for the webpage data asynchronous loading request, and to render the target interface in the webpage based on the response data to generate a new target interface;
an extracting unit 503, configured to extract a corresponding parsing field from the newly generated target interface; and
a parsing unit 504, configured to parse the source code of the response data based on the extracted parsing field, so as to obtain the asynchronous loading rule of the newly generated target interface.
In some implementations of the embodiments of the present disclosure, the sending unit 501 is specifically configured to:
acquire the webpage data asynchronous loading request that a client sent to the server and that was intercepted by an interceptor; and simulate sending the intercepted request, so as to initiate the webpage data asynchronous loading request to the server.
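As an illustration of replaying an intercepted request, the following is a minimal sketch that assumes the interceptor hands over the request as a plain dictionary. The dictionary layout, the function name `build_replay_request` and the example URL are hypothetical and are not taken from the disclosure.

```python
from urllib.request import Request


def build_replay_request(captured: dict) -> Request:
    """Rebuild an intercepted asynchronous request so that it can be sent again."""
    # Headers that the HTTP client recomputes by itself are dropped.
    skipped = {"content-length", "host"}
    headers = {
        name: value
        for name, value in captured.get("headers", {}).items()
        if name.lower() not in skipped
    }
    body = captured.get("body")
    data = body.encode("utf-8") if isinstance(body, str) else body
    return Request(
        captured["url"],
        data=data,
        headers=headers,
        method=captured.get("method", "GET"),
    )


# Hypothetical record produced by an interceptor.
captured = {
    "url": "https://example.com/api/list?page=2",
    "method": "GET",
    "headers": {"X-Requested-With": "XMLHttpRequest", "Host": "example.com"},
    "body": None,
}
replay = build_replay_request(captured)
# Passing `replay` to urllib.request.urlopen would actually send it;
# that call is left out here so the sketch needs no network access.
```

Keeping the rebuilt request separate from the act of sending it makes it possible to inspect or adjust the replayed request before it reaches the server.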
In some implementations of the disclosed embodiments, the apparatus 500 may further include:
a determining unit (not shown in fig. 5) for determining a new target interface by:
before rendering a target interface in a page based on response data to generate a new target interface, acquiring a webpage source code of the page;
after rendering the target interface in the page based on the response data to generate a new target interface, acquiring the webpage source code of the page again; and
determining the new target interface by comparing the page source code acquired on the two occasions.
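The before-and-after comparison can be sketched with a line-based diff. This is only one possible way to compare the two snapshots; the disclosure does not state which comparison technique is used, and the function name `new_fragment` and the sample markup are assumptions made for illustration.

```python
import difflib


def new_fragment(before_html: str, after_html: str) -> list[str]:
    """Return the source lines that appear only in the second snapshot."""
    before = before_html.splitlines()
    after = after_html.splitlines()
    added = []
    matcher = difflib.SequenceMatcher(a=before, b=after, autojunk=False)
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # "insert" and "replace" both contribute lines that are new in `after`.
        if tag in ("insert", "replace"):
            added.extend(after[j1:j2])
    return added


before = "<ul id='list'>\n<li>Alpha Ltd</li>\n</ul>"
after = (
    "<ul id='list'>\n<li>Alpha Ltd</li>\n"
    "<li>Beta Inc</li>\n<li>Gamma LLC</li>\n</ul>"
)
added_lines = new_fragment(before, after)
```

A line-based diff is sensitive to how the page source is formatted, so a real implementation might compare DOM nodes instead.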
In some implementations of the disclosed embodiments, the extracting unit 503 is specifically configured to:
divide the newly generated target interface into at least one row block;
extract a corresponding data fingerprint from any one of the at least one row block; and
break the data fingerprint down into at least one parsing field.
In some other implementations of the disclosed embodiments, the extracting unit 503 is specifically configured to:
divide the newly generated target interface into at least one row block; and
for each of the at least one row block, extract a corresponding data fingerprint and break each data fingerprint down into at least one corresponding parsing field.
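The row-block and fingerprint steps can be sketched as follows. This section does not define how a row block or a data fingerprint is represented, so the sketch makes two labeled assumptions: a row block is one `<tr>` element, and its data fingerprint is the row's visible cell text joined with `|`. The class and function names are hypothetical.

```python
from html.parser import HTMLParser


class RowBlockParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row block."""

    def __init__(self):
        super().__init__()
        self.rows = []  # one list of cell texts per row block
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag == "td" and self.rows:
            self._in_cell = True
            self.rows[-1].append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data


def extract_parsing_fields(fragment: str) -> list[list[str]]:
    """Return, for each row block, the parsing fields obtained from its fingerprint."""
    parser = RowBlockParser()
    parser.feed(fragment)
    parser.close()
    # Assumed fingerprint format: the cell texts of one row joined with "|".
    fingerprints = ["|".join(cell.strip() for cell in row) for row in parser.rows]
    # Breaking a fingerprint down yields its non-empty parts as parsing fields.
    return [[part for part in fp.split("|") if part] for fp in fingerprints]


fragment = (
    "<tr><td>Alpha Ltd</td><td>2019-03-01</td></tr>"
    "<tr><td>Beta Inc</td><td>2020-07-15</td></tr>"
)
fields_per_row = extract_parsing_fields(fragment)
```

With this representation, taking `fields_per_row[0]` corresponds to working from any one row block, and iterating over the whole list corresponds to working from every row block.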
In some implementations of the disclosed embodiments, the parsing unit 504 is specifically configured to:
search the source code of the response data in a preset recursive manner to determine the leaf node of each of the at least one parsing field, thereby obtaining at least one corresponding leaf node;
continue searching the source code of the response data to determine a common parent node of the at least one leaf node;
determine the data group corresponding to the current row block through the common parent node;
determine the asynchronous loading rule of the current row block based on the data group corresponding to the current row block; and
derive the asynchronous loading rule of the newly generated target interface based on the asynchronous loading rule of the current row block.
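The search-and-derive sequence above can be sketched for JSON response data as follows. The representation is an assumption: a node is addressed by its path of keys and list indices from the root, a rule is a dictionary holding a wildcarded group path plus the relative path of each field, and the names `find_leaf_path` and `derive_row_rule` are hypothetical. The sketch only handles the case where all leaf nodes share one parent.

```python
import json


def find_leaf_path(node, value, path=()):
    """Depth-first search for the first leaf whose text equals `value`."""
    if isinstance(node, dict):
        children = node.items()
    elif isinstance(node, list):
        children = enumerate(node)
    else:
        return path if str(node) == value else None
    for key, child in children:
        found = find_leaf_path(child, value, path + (key,))
        if found is not None:
            return found
    return None


def derive_row_rule(data, fields):
    """Derive a rule for one row block, or None if it cannot be derived."""
    leaf_paths = [find_leaf_path(data, field) for field in fields]
    if any(path is None for path in leaf_paths):
        return None
    parents = {path[:-1] for path in leaf_paths}
    if len(parents) != 1:
        return None  # differing parents are outside the scope of this sketch
    common = parents.pop()
    # Replacing list indices with "*" turns the row's parent into a data group path.
    group = tuple("*" if isinstance(step, int) else step for step in common)
    return {"group": group, "fields": [path[len(common):] for path in leaf_paths]}


response_text = json.dumps({
    "code": 0,
    "data": {"items": [
        {"name": "Alpha Ltd", "date": "2019-03-01"},
        {"name": "Beta Inc", "date": "2020-07-15"},
    ]},
})
rule = derive_row_rule(json.loads(response_text), ["Alpha Ltd", "2019-03-01"])
```

Here the rule says that every element of `data.items` is one row, and that the row's fields are found under the keys `name` and `date`.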
In some implementations of the disclosed embodiments, the parsing unit 504 is specifically configured to:
continue searching the source code of the response data to determine the parent node of each of the at least one leaf node; and
in the case where the parent nodes of all the leaf nodes are the same node, take that node as the common parent node of the at least one leaf node.
In some implementations of the disclosed embodiments, the apparatus 500 may further include:
a common parent node determining unit (not shown in fig. 5) for determining, as a common parent node of at least one leaf node, a parent node whose path to the root node is shortest among parent nodes of the respective leaf nodes in the case where the parent nodes of the respective leaf nodes are different nodes.
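The two cases for choosing the common parent node can be sketched on node paths. As an assumption for illustration, each leaf node is identified by its path from the root, so the length of a parent's path is its distance to the root; the function names are hypothetical.

```python
def choose_common_parent(leaf_paths):
    """Pick the common parent of the leaves from their individual parents."""
    parents = [path[:-1] for path in leaf_paths]
    if all(parent == parents[0] for parent in parents):
        return parents[0]
    # Parents differ: the parent with the fewest steps is the closest to the root.
    # If several parents tie, min() keeps the first one.
    return min(parents, key=len)


def covers_all(parent, leaf_paths):
    """Check that the chosen parent is an ancestor of every leaf."""
    return all(path[:len(parent)] == parent for path in leaf_paths)


same_parent = [("data", "items", 0, "name"), ("data", "items", 0, "date")]
nested = [("data", "items", 0, "name"), ("data", "items", 0, "meta", "date")]

parent_same = choose_common_parent(same_parent)
parent_nested = choose_common_parent(nested)
```

The helper `covers_all` is an addition of this sketch: the text does not say what happens when the parent closest to the root is not an ancestor of every leaf, so the sketch only makes that situation detectable.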
In some implementations of the disclosed embodiments, the extracting unit 503 is specifically configured to:
in the case where the response data is JSON data, perform the following operation:
extract a corresponding parsing field from the newly generated target interface.
In some implementations of the disclosed embodiments, the parsing unit 504 is specifically configured to:
in the case where the response data is JSON data, perform the following operation:
parse the source code of the response data based on the extracted parsing field to obtain the asynchronous loading rule of the newly generated target interface.
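A simple way to test the JSON condition is to attempt to decode the response body. The sketch below is an assumption about how such a check might look; the function name `parse_if_json` is hypothetical.

```python
import json


def parse_if_json(response_text: str):
    """Return the decoded payload when the response body is JSON, else None."""
    try:
        payload = json.loads(response_text)
    except json.JSONDecodeError:
        return None
    # A bare string or number is valid JSON but offers no tree to search.
    return payload if isinstance(payload, (dict, list)) else None


json_payload = parse_if_json('{"data": {"items": [1, 2]}}')
html_payload = parse_if_json("<html><body>not JSON</body></html>")
```

Extraction and parsing would then proceed only when the returned payload is not `None`.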
In some implementations of the disclosed embodiments, the parsing unit 504 is specifically configured to:
parse the source code of the response data based on at least one parsing field extracted from any one of the at least one row block to obtain the asynchronous loading rule of that row block;
parse the source code of the response data based on at least one parsing field extracted from another one of the at least one row block to obtain the asynchronous loading rule of that other row block; and, in the case where the two asynchronous loading rules are consistent, take the asynchronous loading rule of either row block as the asynchronous loading rule of the newly generated target interface.
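The consistency check between two row blocks can be sketched as follows, assuming each row's rule starts out as the concrete paths of its leaf nodes in the response data. Treating two rules as consistent when they are equal after list indices are generalised is an interpretation made for this sketch; the function names are hypothetical.

```python
def generalise(path):
    """Replace list indices with "*" so that paths from different rows are comparable."""
    return tuple("*" if isinstance(step, int) else step for step in path)


def confirm_rule(paths_row_a, paths_row_b):
    """Return the shared rule when both row blocks yield the same one, else None."""
    rule_a = [generalise(path) for path in paths_row_a]
    rule_b = [generalise(path) for path in paths_row_b]
    return rule_a if rule_a == rule_b else None


row_a = [("data", "items", 0, "name"), ("data", "items", 0, "date")]
row_b = [("data", "items", 1, "name"), ("data", "items", 1, "date")]
row_c = [("data", "ads", 0, "title"), ("data", "ads", 0, "date")]

confirmed = confirm_rule(row_a, row_b)
rejected = confirm_rule(row_a, row_c)
```

Checking a second row guards against a rule that happens to match one row by coincidence, for example when a field value also appears elsewhere in the response.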
In some implementations of the embodiments of the present disclosure, the sending unit 501 is specifically configured to:
automatically trigger the corresponding page-turning control based on an automatic page-turning rule, so as to send the webpage data asynchronous loading request.
Because it is based on the same inventive concept, the parsing apparatus 500 for an asynchronous loading rule provided by the embodiments of the present disclosure has the same beneficial effects as the parsing method for an asynchronous loading rule provided by the foregoing embodiments of the present application.
The embodiments of the present application also provide an electronic device corresponding to the parsing method of the asynchronous loading rule provided in the foregoing embodiments. The electronic device may be a server-side device, for example a server (either an independent server or a distributed server cluster), or a client-side device such as a mobile phone, a notebook computer, a tablet computer or a desktop computer; in either case it is used to execute the above parsing method of the asynchronous loading rule.
Referring to fig. 6, a schematic diagram of an electronic device according to some embodiments of the present application is shown. As shown in fig. 6, the electronic device 30 includes a processor 300, a memory 301, a bus 302 and a communication interface 303; the processor 300, the communication interface 303 and the memory 301 are connected by the bus 302. The memory 301 stores a computer program that can be executed on the processor 300, and when the processor 300 executes the computer program, the parsing method of the asynchronous loading rule described in the present application is performed.
The memory 301 may include a high-speed random access memory (RAM) and may further include a non-volatile memory, such as at least one magnetic disk memory. The communication connection between this system network element and at least one other network element is implemented through at least one communication interface 303 (which may be wired or wireless), and may use the Internet, a wide area network, a local area network, a metropolitan area network, or the like.
Bus 302 may be an ISA bus, a PCI bus, an EISA bus, or the like. The bus may be classified as an address bus, a data bus, a control bus, and so on. The memory 301 is configured to store a program, and the processor 300 executes the program after receiving an execution instruction; the parsing method of the asynchronous loading rule disclosed in any of the foregoing embodiments of the disclosure may be applied to the processor 300 or implemented by the processor 300.
The processor 300 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in the processor 300 or by instructions in the form of software. The processor 300 may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), etc.; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. It may implement or perform the methods, steps and logic blocks disclosed in the embodiments of the disclosure. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of a method disclosed in connection with the embodiments of the present disclosure may be embodied directly as being executed by a hardware decoding processor, or as being executed by a combination of hardware and software modules in a decoding processor. The software modules may be located in a storage medium well known in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 301; the processor 300 reads the information in the memory 301 and, in combination with its hardware, performs the steps of the above method.
The electronic device provided by the embodiments of the present disclosure is based on the same inventive concept as the parsing method of the asynchronous loading rule provided by the embodiments of the present disclosure, and has the same beneficial effects as the method that it adopts, runs or implements.
The present application further provides a computer readable storage medium corresponding to the parsing method of the asynchronous loading rule provided in the foregoing embodiments. Referring to fig. 7, the computer readable storage medium is shown as an optical disc 70 on which a computer program (i.e. a program product) is stored; when executed by a processor, the computer program performs the foregoing parsing method of the asynchronous loading rule.
It should be noted that examples of the computer readable storage medium may also include, but are not limited to, a phase change memory (PRAM), a Static Random Access Memory (SRAM), a Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a flash memory, or other optical or magnetic storage medium, which will not be described in detail herein.
The computer readable storage medium provided by the above embodiments of the present application and the parsing method of the asynchronous loading rule provided by the embodiments of the present disclosure have the same advantageous effects as the method adopted, operated or implemented by the application program stored therein, because of the same inventive concept.
It is noted that the flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, and are not repeated herein.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described apparatus embodiments are merely illustrative, for example, the division of the units is merely a logical function division, and there may be other manners of division in actual implementation, and for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be through some communication interface, device or unit indirect coupling or communication connection, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or various other media capable of storing program code.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the embodiments, and are intended to be included within the scope of the claims and description.

Claims (14)

1. A parsing method for an asynchronous loading rule, comprising:
sending a webpage data asynchronous loading request;
obtaining response data returned by the server for the webpage data asynchronous loading request, and rendering a target interface in a page based on the response data to generate a new target interface;
extracting a corresponding parsing field from the newly generated target interface; and
parsing the source code of the response data based on the extracted parsing field to obtain the asynchronous loading rule of the newly generated target interface, wherein the parsing comprises:
searching the source code of the response data in a preset recursive manner to determine the leaf node of each parsing field in the at least one parsing field, thereby obtaining at least one corresponding leaf node;
continuing searching the source code of the response data to determine a common parent node of the at least one leaf node;
determining a data group corresponding to the current row block through the common parent node;
determining an asynchronous loading rule of the current row block based on the data group corresponding to the current row block; and
deriving the asynchronous loading rule of the newly generated target interface based on the asynchronous loading rule of the current row block.
2. The method of claim 1, wherein sending the web page data asynchronous load request comprises:
acquiring a webpage data asynchronous loading request that a client sent to the server and that was intercepted by an interceptor; and
simulating sending the intercepted webpage data asynchronous loading request, so as to initiate the webpage data asynchronous loading request to the server.
3. The method of claim 1, wherein determining the new target interface comprises:
before rendering a target interface in a page based on the response data to generate a new target interface, acquiring webpage source codes of the page;
after rendering the target interface in the page based on the response data to generate a new target interface, acquiring the webpage source code of the page again; and
determining the new target interface by comparing the webpage source code acquired on the two occasions.
4. The method of claim 1, wherein extracting the corresponding parsing field from the newly generated target interface comprises:
dividing the newly generated target interface into at least one row block;
extracting a corresponding data fingerprint from any one of the at least one row block; and
breaking the data fingerprint down into at least one parsing field.
5. The method of claim 1, wherein extracting the corresponding parsing field from the newly generated target interface comprises:
dividing the newly generated target interface into at least one row block;
for each of the at least one row block, extracting a corresponding data fingerprint and breaking down each data fingerprint into a corresponding at least one parsing field.
6. The method of claim 1, wherein continuing to search the source code of the response data to determine a common parent node of the at least one leaf node comprises:
continuing searching the source code of the response data to determine the parent node of each leaf node in the at least one leaf node; and in the case that the parent nodes of the respective leaf nodes are the same node, taking the same node as the common parent node of the at least one leaf node.
7. The method of claim 6, further comprising:
in the case that the parent nodes of the respective leaf nodes are different nodes, determining the parent node with the shortest path to the root node among the parent nodes of the respective leaf nodes as the common parent node of the at least one leaf node.
8. The method of claim 1, wherein the following is performed if the response data is JSON data:
extracting a corresponding parsing field from the newly generated target interface; and
parsing the source code of the response data based on the extracted parsing field to obtain the asynchronous loading rule of the newly generated target interface.
9. The method of claim 5, wherein parsing the source code of the response data based on the extracted parsing field to obtain the asynchronous loading rules of the newly generated target interface comprises:
parsing the source code of the response data based on at least one parsing field extracted from any one of the at least one row block to obtain an asynchronous loading rule of that row block;
parsing the source code of the response data based on at least one parsing field extracted from another one of the at least one row block to obtain an asynchronous loading rule of that other row block; and, in the case that the two asynchronous loading rules are consistent, taking the asynchronous loading rule of either row block as the asynchronous loading rule of the newly generated target interface.
10. The method of claim 1, wherein sending the web page data asynchronous load request comprises:
automatically triggering a corresponding page-turning control based on an automatic page-turning rule, so as to send the webpage data asynchronous loading request.
11. An apparatus for parsing an asynchronous loading rule, comprising:
a sending unit, configured to send a webpage data asynchronous loading request;
an acquiring and rendering unit, configured to acquire response data returned by the server for the webpage data asynchronous loading request, and to render a target interface in a page based on the response data so as to generate a new target interface;
an extracting unit, configured to extract a corresponding parsing field from the newly generated target interface; and
a parsing unit, configured to parse the source code of the response data based on the extracted parsing field, so as to obtain an asynchronous loading rule of the newly generated target interface, wherein the parsing comprises:
searching the source code of the response data in a preset recursive manner to determine the leaf node of each parsing field in the at least one parsing field, thereby obtaining at least one corresponding leaf node;
continuing searching the source code of the response data to determine a common parent node of the at least one leaf node;
determining a data group corresponding to the current row block through the common parent node;
determining an asynchronous loading rule of the current row block based on the data group corresponding to the current row block; and
deriving the asynchronous loading rule of the newly generated target interface based on the asynchronous loading rule of the current row block.
12. An electronic device, the electronic device comprising:
a processor;
a memory for storing the processor-executable instructions;
the processor being configured to read the executable instructions from the memory and execute the executable instructions to implement the method of any one of the preceding claims 1 to 10.
13. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program for executing the method of any of the preceding claims 1 to 10.
14. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1 to 10.
CN202210177370.7A 2022-02-25 2022-02-25 Analysis method and device of asynchronous loading rule, storage medium and electronic equipment Active CN114611039B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210177370.7A CN114611039B (en) 2022-02-25 2022-02-25 Analysis method and device of asynchronous loading rule, storage medium and electronic equipment

Publications (2)

Publication Number Publication Date
CN114611039A CN114611039A (en) 2022-06-10
CN114611039B true CN114611039B (en) 2024-02-20

Family

ID=81858940

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210177370.7A Active CN114611039B (en) 2022-02-25 2022-02-25 Analysis method and device of asynchronous loading rule, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN114611039B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115563427B (en) * 2022-07-20 2023-07-18 合肥汉泰网络科技有限公司 Construction method, system and computer equipment of responsive asynchronous portal

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109033115A (en) * 2017-06-12 2018-12-18 广东技术师范学院 A kind of dynamic web page crawler system
CN111294395A (en) * 2020-01-20 2020-06-16 广东金赋科技股份有限公司 Terminal page transmission method, device, medium and electronic equipment
CN112433784A (en) * 2020-12-10 2021-03-02 东莞市盟大塑化科技有限公司 Page loading method, device, equipment and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109344351B (en) * 2018-10-18 2021-01-05 网宿科技股份有限公司 Webpage loading method, intermediate server and webpage loading system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
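Analysis of Web asynchronous loading technology and implementation of an information crawling strategy; 杜润泽; 梁英; 方英兰; 电脑知识与技术 (Computer Knowledge and Technology); 20180825 (No. 24); full text *
Development of an Ajax-based mobile asynchronous interactive news system; 李颖; 唐冶; 李松林; 巢湖学院学报 (Journal of Chaohu University); 20161125 (No. 06); full text *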

Also Published As

Publication number Publication date
CN114611039A (en) 2022-06-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20230731

Address after: Room 404-405, 504, Building B-17-1, Big data Industrial Park, Kecheng Street, Yannan High tech Zone, Yancheng, Jiangsu Province, 224000

Applicant after: Yancheng Tianyanchawei Technology Co.,Ltd.

Address before: 224000 room 501-503, building b-17-1, Xuehai road big data Industrial Park, Kecheng street, Yannan high tech Zone, Yancheng City, Jiangsu Province

Applicant before: Yancheng Jindi Technology Co.,Ltd.

GR01 Patent grant