CN113076457A - Crawler action processing method and device - Google Patents

Crawler action processing method and device

Info

Publication number
CN113076457A
Authority
CN
China
Prior art keywords
bookmark
crawler
action
data processing
preset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110381951.8A
Other languages
Chinese (zh)
Inventor
梁益欣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Aerospace Information Guangdong Co ltd
Original Assignee
Aerospace Information Guangdong Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Aerospace Information Guangdong Co ltd
Priority to CN202110381951.8A
Publication of CN113076457A
Current legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/955 Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
    • G06F16/9562 Bookmark management

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a crawler action processing method and a crawler action processing device, wherein the method comprises the following steps: acquiring a preset crawler action bookmark; calling the preset crawler action bookmark by utilizing a crawler processing program, and performing data processing corresponding to the crawler action bookmark to obtain a data processing result; and inputting data to each service system based on the data processing result. The crawler action processing method and the crawler action processing device can improve the crawler action processing effect.

Description

Crawler action processing method and device
Technical Field
The invention relates to the field of computers, in particular to a crawler action processing method and device.
Background
Currently, each service system needs to store data in a structured form. When data is entered, the original data is therefore converted to the required format manually and then stored in each service system.
A crawler can crawl the original data and then enter it. However, processing a crawler action requires a programmer to write the crawler action into the program code, and when the crawler action needs to be updated, the update depends on a programmer familiar with the business logic manually modifying the program code, so the processing effect of the crawler action is poor.
In view of the above problems, no effective solution has been proposed.
Disclosure of Invention
The embodiment of the invention provides a crawler action processing method and device, which are used for at least improving the crawler action processing effect.
According to an aspect of an embodiment of the present invention, there is provided a crawler action processing method, including: acquiring a preset crawler action bookmark; calling the preset crawler action bookmark by utilizing a crawler processing program, and performing data processing corresponding to the crawler action bookmark to obtain a data processing result; and inputting data to each service system based on the data processing result.
As an optional implementation manner, the preset crawler action bookmark is set by the following steps: acquiring bookmark fields corresponding to all bookmark field types; and determining the preset crawler action bookmark based on the bookmark position corresponding to each bookmark field type and each obtained bookmark field.
As an optional implementation, the bookmark field category includes at least any combination of the following field categories: bookmark identification, element positioning mode, element positioning configuration, element type and cell information.
As an optional implementation manner, the invoking, by using a crawler processing program, the preset crawler action bookmark, and performing data processing corresponding to the crawler action bookmark to obtain a data processing result includes: identifying each bookmark field of the preset crawler action bookmark by using the crawler processing program; determining original data based on bookmark fields corresponding to the element positioning modes; and configuring the original data based on the bookmark fields corresponding to the element positioning configuration to obtain a data processing result.
As an optional implementation manner, the configuring of the original data based on the bookmark fields corresponding to the element positioning configuration to obtain a data processing result includes: analyzing bookmark fields corresponding to the element positioning configuration based on a preset equivalent substituted character table to obtain analyzed fields; and configuring the original data based on the analyzed fields to obtain the data processing result.
As an optional implementation manner, the invoking, by using a crawler processing program, the preset crawler action bookmark, and performing data processing corresponding to the crawler action bookmark to obtain a data processing result includes: and in the process of calling the preset crawler action bookmark by using the crawler processing program, in response to the detection that target characters exist in the preset crawler action bookmark, executing data processing corresponding to the target characters to obtain a data processing result.
According to another aspect of the embodiments of the present invention, there is also provided a crawler action processing apparatus, including: a bookmark acquisition unit configured to acquire a preset crawler action bookmark; the bookmark calling unit is configured to call the preset crawler action bookmark by using a crawler processing program, and perform data processing corresponding to the crawler action bookmark to obtain a data processing result; and the data entry unit is configured to enter data into each business system based on the data processing result.
As an optional implementation manner, the preset crawler action bookmark is set by the following steps: acquiring bookmark fields corresponding to all bookmark field types; and determining the preset crawler action bookmark based on the bookmark position corresponding to each bookmark field type and each obtained bookmark field.
As an optional implementation, the bookmark field category includes at least any combination of the following field categories: bookmark identification, element positioning mode, element positioning configuration, element type and cell information.
As an optional implementation, the bookmark invoking unit is further configured to: identifying each bookmark field of the preset crawler action bookmark by using the crawler processing program; determining original data based on bookmark fields corresponding to the element positioning modes; and configuring the original data based on the bookmark fields corresponding to the element positioning configuration to obtain a data processing result.
As an optional implementation, the bookmark invoking unit is further configured to: analyzing bookmark fields corresponding to the element positioning configuration based on a preset equivalent substituted character table to obtain analyzed fields; and configuring the original data based on the analyzed field to obtain a data processing result.
As an optional implementation, the bookmark invoking unit is further configured to: and in the process of calling the preset crawler action bookmark by using the crawler processing program, in response to the detection that target characters exist in the preset crawler action bookmark, executing data processing corresponding to the target characters to obtain a data processing result.
According to still another aspect of the embodiments of the present invention, there is also provided a computer-readable storage medium, in which a computer program is stored, wherein the computer program is configured to execute the above crawler action processing method when running.
According to another aspect of the embodiments of the present invention, there is also provided an electronic apparatus, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the above crawler action processing method through the computer program.
In the embodiment of the invention, the preset crawler action bookmark can be obtained, and the data processing corresponding to the crawler action bookmark can be realized by directly calling the preset crawler action bookmark by operating the crawler processing program, so that the data can be input into each service system. When the crawler action needs to be updated, the corresponding crawler action bookmark only needs to be modified, and extra processing is not needed to be carried out on the crawler processing program, so that the crawler action processing effect can be improved.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
FIG. 1 is a flow diagram of an alternative crawler action processing method according to an embodiment of the present invention;
FIG. 2 is a flow diagram of another alternative crawler action processing method according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of an alternative crawler action processing apparatus according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of an alternative electronic device according to an embodiment of the invention;
FIG. 5 is a schematic diagram of an alternative crawler action bookmark according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
An embodiment of the present invention provides an optional crawler action processing method. As shown in FIG. 1, the crawler action processing method includes:
S101, acquiring a preset crawler action bookmark.
In this embodiment of the present invention, the execution subject may be a terminal device or a server, which is not limited in this embodiment. Taking the terminal device as the execution subject as an example, the technical solution in the embodiment of the present invention may be applied to a browser client, a PC client, and the like in the terminal device, which is not limited in this embodiment. A client of the terminal device can run a service system for storing data and call a preset crawler action bookmark through a crawler processing program to execute the corresponding crawler action and the corresponding data processing, so that data is entered into the service system of each client. The crawler action bookmark may be a Word bookmark in which a crawler action is defined. These crawler action bookmarks may be customized by the user. When a crawler action needs to be updated, the user only needs to update the crawler action bookmark, and no change to the crawler processing program is needed.
S102, calling the preset crawler action bookmark by using a crawler processing program, and performing data processing corresponding to the crawler action bookmark to obtain a data processing result.
In the embodiment of the present invention, the crawler processing program may be a program for automatically filling the content of a Word form into each service system. The crawler action executed by the crawler processing program is determined by calling a preset crawler action bookmark. The crawler actions at least include crawler data copying, crawler data entry, and the like. Further, the crawler action bookmark may include different bookmark field categories, including at least any combination of the following: bookmark identification, element positioning mode, element positioning configuration, element type, and cell information. The bookmark identification describes the use of the bookmark, for example whether the bookmark belongs to a system command, system maintenance, or data copy. The element positioning mode is the way in which the data element is determined, for example based on path information. The element positioning configuration configures the element positioning mode, for example by configuring a data acquisition path. The element type defines the data type of the data element and may include, but is not limited to, string type, decimal type, positive decimal type, and the like. The cell information describes the table in which the data is located, and may include, but is not limited to, the identification of the table, the row number, the column number, and the like. After the data processing corresponding to the crawler action bookmark is performed, a data processing result may be generated. The data processing result indicates the category of crawler action executed on each piece of data.
It should be noted that each bookmark field category in the crawler action bookmark may be divided by using a preset division symbol, and for a field corresponding to each bookmark field category, if the field includes a plurality of subfields, the field may also be divided by using a preset division symbol. When the crawler processing program is used for calling the crawler action bookmarks, all bookmark field categories can be quickly positioned based on the segmentation symbols, so that the data processing configuration information can be conveniently obtained, and the data processing efficiency is improved.
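The splitting described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the separators `_` and `-`, the field names, and the sample bookmark `PO_i_amount_str_1-2-3` are all assumptions made for the example, since the text only states that a preset division symbol is used. The sketch also assumes an omitted field is left empty rather than dropped.

```python
# Field categories in left-to-right order, as listed in the description.
# The Python names themselves are hypothetical.
FIELD_CATEGORIES = (
    "bookmark_id",
    "locate_mode",
    "locate_config",
    "element_type",
    "cell_info",
)


def split_bookmark(bookmark: str, field_sep: str = "_", sub_sep: str = "-") -> dict:
    """Split a bookmark into field categories, then each field into subfields.

    Both separators are assumptions; the source does not name the symbols.
    """
    parts = bookmark.split(field_sep)
    if len(parts) != len(FIELD_CATEGORIES):
        raise ValueError(
            f"expected {len(FIELD_CATEGORIES)} fields, got {len(parts)}"
        )
    # An empty field yields an empty subfield list.
    return {
        name: part.split(sub_sep) if part else []
        for name, part in zip(FIELD_CATEGORIES, parts)
    }


fields = split_bookmark("PO_i_amount_str_1-2-3")
print(fields["cell_info"])
```

Because every category sits at a fixed position, one split is enough to reach any field without scanning the bookmark character by character.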
As an optional implementation manner, the invoking, by using a crawler processing program, the preset crawler action bookmark, and performing data processing corresponding to the crawler action bookmark to obtain a data processing result includes: and in the process of calling the preset crawler action bookmark by using the crawler processing program, in response to the detection that target characters exist in the preset crawler action bookmark, executing data processing corresponding to the target characters to obtain a data processing result.
In the present embodiment, the target characters are preset characters, each of which has corresponding preset action information. The execution subject stores in advance the correspondence between the target characters and the action information, as shown in Table 1:
Table 1
[table image not reproduced in this text]
As can be seen from Table 1, if a target character exists in the crawler action bookmark, the crawler processing program can execute the corresponding data processing operation based on that target character. For example, a selection box may pop up for interaction with the user, and the data processing result is generated from the content the user selects; or the corresponding content in the crawler action bookmark is taken as specified information, and the data processing result is obtained from that specified information.
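The correspondence between target characters and actions can be pictured as a lookup table of handlers. The sketch below is only illustrative: the characters `?` and `=` and both handler functions are hypothetical, because the actual pairs are listed in Table 1, which is not available in this text.

```python
from typing import Callable, Dict, List


def ask_user(content: str) -> str:
    # Hypothetical stand-in for popping up a selection box.
    return f"prompt:{content}"


def use_specified(content: str) -> str:
    # Hypothetical stand-in for treating the content as specified information.
    return f"specified:{content}"


# Hypothetical target characters; the real ones are defined in Table 1.
ACTIONS: Dict[str, Callable[[str], str]] = {"?": ask_user, "=": use_specified}


def process_bookmark(bookmark: str) -> List[str]:
    """Run the action of every target character detected in the bookmark."""
    results = []
    for char, action in ACTIONS.items():
        if char in bookmark:
            results.append(action(bookmark.replace(char, "")))
    return results


print(process_bookmark("=invoice"))
```

Adding a new action then means adding one entry to the table, while the detection loop stays unchanged.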
S103, entering data into each service system based on the data processing result.
In the embodiment of the present invention, after the data processing result is obtained, data analysis may be performed on the data processing result to determine data that needs to be entered into each service system, so as to complete the entry of data into the service system.
In the embodiment of the invention, the preset crawler action bookmark can be obtained, and the data processing corresponding to the crawler action bookmark can be realized by directly calling the preset crawler action bookmark by operating the crawler processing program, so that the data can be input into each service system. When the crawler action needs to be updated, the corresponding crawler action bookmark only needs to be modified, and extra processing is not needed to be carried out on the crawler processing program, so that the crawler action processing effect can be improved.
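Steps S101 to S103 can be sketched end to end as below. All names are hypothetical, and the document and the business system are modelled as plain dictionaries purely for illustration; a real implementation would read Word bookmarks and write to an actual service system.

```python
from typing import Dict, List


def acquire_bookmarks(document: Dict[str, str]) -> List[str]:
    """S101: return the crawler action bookmarks defined in the document."""
    return list(document)


def invoke_bookmark(bookmark: str, document: Dict[str, str]) -> Dict[str, str]:
    """S102: perform the data processing that the bookmark describes."""
    return {"bookmark": bookmark, "value": document[bookmark].strip()}


def enter_data(results: List[Dict[str, str]], system: Dict[str, str]) -> None:
    """S103: enter each processed value into a business system."""
    for result in results:
        system[result["bookmark"]] = result["value"]


# Hypothetical document content keyed by bookmark name.
document = {"amount": " 12.50 ", "buyer": "ACME "}
business_system: Dict[str, str] = {}

results = [invoke_bookmark(b, document) for b in acquire_bookmarks(document)]
enter_data(results, business_system)
print(business_system)
```

The point of the design is visible here: changing what gets crawled means changing the bookmark data, not the three functions.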
Further, an embodiment of the present invention provides another optional crawler action processing method. As shown in FIG. 2, the crawler action processing method includes:
S201, acquiring bookmark fields corresponding to all bookmark field types.
In the embodiment of the present invention, the bookmark field categories at least include any combination of the following field categories: bookmark identification, element positioning mode, element positioning configuration, element type, and cell information. The bookmark identification describes the use of the bookmark, for example whether the bookmark belongs to a system command, system maintenance, or data copy. The element positioning mode is the way in which the data element is determined, for example based on path information. The element positioning configuration configures the element positioning mode, for example by configuring a data acquisition path. The element type defines the data type of the data element and may include, but is not limited to, string type, decimal type, positive decimal type, and the like. The cell information describes the table in which the data is located, and may include, but is not limited to, the identification of the table, the row number, the column number, and the like.
Referring to FIG. 5, which is a schematic diagram of an optional crawler action bookmark according to an embodiment of the present invention, the bookmark field categories of each crawler action bookmark are shown from left to right: bookmark identification, element positioning mode, element positioning configuration, element type, and cell information. Specifically, the bookmark identification PO_ is a bookmark field that can be omitted: when the Word bookmark is used for a system command (control command), maintenance, data copy, and the like, the bookmark identification can be omitted; if the Word bookmark is used for some special purpose, a control for the user to fill in content pops up, and the identification customized by the user is received. An element positioning mode coding table is preset for the element positioning mode, as follows:
Table 2
[table image not reproduced in this text]
As can be seen from Table 2, in the bookmark field corresponding to the element positioning mode, different codes represent different element positioning modes. When multiple elements meet the condition, a corresponding element selection policy applies, and in the special case where the search ID is 0, a corresponding special element selection policy applies. It should be noted that mini_xpath may be an element positioning mode obtained by simplifying and abbreviating elements of xpath. Optionally, if multiple element positioning modes are required, they may be applied in a predetermined order, such as p → i → n → c → x. If the bookmark field corresponding to the element positioning mode is left empty, the element positioning mode corresponding to code x is used by default.
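The ordering and default rule for positioning codes can be sketched as follows. The sketch assumes that each code is a single character and that several codes are simply concatenated in the field; neither detail is stated in the text. The special codes of Table 3 are outside its scope.

```python
from typing import List

# Order given in the description: p -> i -> n -> c -> x.
MODE_ORDER = "pincx"
# Code used when the positioning-mode field is left empty.
DEFAULT_MODE = "x"


def ordered_modes(mode_field: str) -> List[str]:
    """Return the positioning codes in the order they should be tried."""
    if not mode_field:
        return [DEFAULT_MODE]
    unknown = set(mode_field) - set(MODE_ORDER)
    if unknown:
        raise ValueError(f"unknown positioning codes: {sorted(unknown)}")
    # Deduplicate, then sort by position in the predetermined order.
    return sorted(set(mode_field), key=MODE_ORDER.index)


print(ordered_modes("xip"))
```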
As an optional implementation, a special element positioning manner may be customized according to the needs of the user, as shown in table three:
Table 3
Code    Special handling
z       Ignored item (e.g., automatically computed)
b       Remark (content shown in a pop-up window as a manual prompt)
s       System reserved field, typically for setting boot commands and maintenance commands
As shown in Table 3, element positioning codes corresponding to special operations can be customized to perform special data processing.
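One way to model the three special codes is an enumeration that the crawler processing program checks before attempting ordinary element positioning. The class and function names below are hypothetical; only the codes z, b, and s and their meanings come from the table.

```python
from enum import Enum
from typing import Optional


class SpecialHandling(Enum):
    IGNORE = "z"  # ignored item, e.g. automatically computed
    REMARK = "b"  # remark shown in a pop-up window
    SYSTEM = "s"  # system reserved field (boot and maintenance commands)


def special_handling(code: str) -> Optional[SpecialHandling]:
    """Return the special handling for a code, or None for ordinary codes."""
    try:
        return SpecialHandling(code)
    except ValueError:
        return None


# An ordinary positioning code such as "x" yields None.
print(special_handling("z"), special_handling("x"))
```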
Further, the bookmark field corresponding to the element positioning configuration may correspond to the element positioning mode field. The specific correspondence is given in Table 4:
Table 4
[table image not reproduced in this text]
As can be seen from Table 4, the element positioning configuration can configure the data elements determined by the corresponding element positioning modes.
Further, the bookmark field for the element type may be represented using the codes in Table 5, which is as follows:
Table 5
[table image not reproduced in this text]
As can be seen from Table 5, different element types can be represented by codes, so that different types of data can be crawled according to different codes.
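A type field of this kind typically drives conversion and validation of the crawled text. The sketch below uses descriptive type names (`string`, `decimal`, `positive_decimal`) in place of the actual codes, which are defined in Table 5 and not available here; treating zero as not positive is also an assumption.

```python
from decimal import Decimal, InvalidOperation
from typing import Union


def convert(raw: str, element_type: str) -> Union[str, Decimal]:
    """Convert a crawled string to the declared element type."""
    if element_type == "string":
        return raw
    if element_type not in ("decimal", "positive_decimal"):
        raise ValueError(f"unknown element type {element_type!r}")
    try:
        value = Decimal(raw.strip())
    except InvalidOperation as exc:
        raise ValueError(f"{raw!r} is not a decimal") from exc
    # Assumption: "positive" excludes zero.
    if element_type == "positive_decimal" and value <= 0:
        raise ValueError(f"{raw!r} is not positive")
    return value


print(convert(" 12.50 ", "decimal"))
```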
As an optional implementation, special element type fields can be customized to meet user requirements such as element click actions and new page actions. See Table 6:
Table 6
[table image not reproduced in this text]
Further, the configuration of the cell information may refer to Table 7:
Table 7
[table image not reproduced in this text]
As can be seen from Table 7, the cell information may include three sub-parts, which represent the table ID, the row ID, and the column ID, respectively. Optionally, the cell information may be abbreviated in the manner shown in Table 8:
Table 8
[table image not reproduced in this text]
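The three-part cell information can be sketched as a small record type. The `-` separator and the integer IDs are assumptions made for illustration, and the abbreviated forms of Table 8 are not handled because their syntax is not available in this text.

```python
from typing import NamedTuple


class CellInfo(NamedTuple):
    table: int
    row: int
    column: int


def parse_cell_info(field: str, sep: str = "-") -> CellInfo:
    """Parse 'table-row-column' into a CellInfo record.

    The separator and the numeric format are assumptions.
    """
    parts = field.split(sep)
    if len(parts) != 3:
        raise ValueError(f"expected table, row and column IDs, got {field!r}")
    table, row, column = (int(p) for p in parts)
    return CellInfo(table, row, column)


cell = parse_cell_info("1-2-3")
print(cell.table, cell.row, cell.column)
```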
S202, determining the preset crawler action bookmark based on the bookmark position corresponding to each bookmark field type and each obtained bookmark field.
In the embodiment of the invention, the bookmark fields corresponding to all the bookmark field categories are combined according to their corresponding bookmark positions to obtain the preset crawler action bookmark. For example, in FIG. 5, the bookmark identification, the element positioning mode, the element positioning configuration, the element type, and the cell information are combined in order from left to right to obtain the preset crawler action bookmark.
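Combining the fields by position can be sketched as a join over a fixed field order. The field names, the `_` separator, and the sample values are hypothetical; only the left-to-right order of the five categories comes from the description.

```python
from typing import Dict

# Left-to-right bookmark positions, following the order shown in FIG. 5.
FIELD_ORDER = (
    "bookmark_id",
    "locate_mode",
    "locate_config",
    "element_type",
    "cell_info",
)


def build_bookmark(fields: Dict[str, str], sep: str = "_") -> str:
    """Join the bookmark fields in their left-to-right positions."""
    missing = [name for name in FIELD_ORDER if name not in fields]
    if missing:
        raise KeyError(f"missing bookmark fields: {missing}")
    return sep.join(fields[name] for name in FIELD_ORDER)


# The dictionary order does not matter; the position table decides.
bookmark = build_bookmark(
    {
        "cell_info": "1-2-3",
        "element_type": "str",
        "locate_config": "amount",
        "locate_mode": "i",
        "bookmark_id": "PO",
    }
)
print(bookmark)
```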
S203, acquiring a preset crawler action bookmark.
In the embodiment of the present invention, for the detailed description of step S203, refer to the detailed description of step S101, which is not repeated herein.
S204, identifying each bookmark field of the preset crawler action bookmark by using the crawler processing program.
In the embodiment of the invention, each bookmark field of the crawler action bookmark can be provided with a preset segmentation symbol, and each bookmark field can be obtained based on the division of the preset segmentation symbol.
S205, determining original data based on the bookmark field corresponding to the element positioning mode.
In the embodiment of the present invention, the content of the corresponding element location mode may be determined according to the bookmark field corresponding to the element location mode, that is, the code in table two, and the original data element may be determined based on the content of the element location mode. For example, the corresponding raw data element is looked up by id.
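Looking up a raw data element by id can be sketched with the Python standard library. The sample page is hypothetical and is assumed to be well-formed XML so that `xml.etree.ElementTree` can parse it; real web pages would normally need an HTML parser or a browser driver instead.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Hypothetical page fragment, kept well-formed for ElementTree.
PAGE = """<form>
  <input id="amount" value="12.50" />
  <input id="buyer" value="ACME" />
</form>"""


def find_by_id(root: ET.Element, element_id: str) -> Optional[ET.Element]:
    """Return the first element whose id attribute equals element_id."""
    return next((el for el in root.iter() if el.get("id") == element_id), None)


root = ET.fromstring(PAGE)
element = find_by_id(root, "amount")
print(element.get("value") if element is not None else None)
```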
S206, configuring the original data based on the bookmark fields corresponding to the element positioning configuration to obtain a data processing result.
In the embodiment of the present invention, after the original data is determined, the execution subject may determine the content of the element positioning configuration, for example an id attribute value, according to the bookmark field corresponding to the element positioning configuration and the correspondence in Table 4. The data processing result obtained in this case is the configured id together with its id attribute value.
As an optional implementation manner, the configuring, based on the bookmark field corresponding to the element positioning configuration, the raw data to obtain a data processing result includes: analyzing bookmark fields corresponding to the element positioning configuration based on a preset equivalent substituted character table to obtain analyzed fields; and configuring the original data based on the analyzed field to obtain the data processing result.
In the embodiment of the invention, when the bookmark field corresponding to the element positioning configuration is configured, original characters that are difficult to recognize (such as English characters) can be replaced based on the equivalent substituted character table; when the bookmark field is analyzed, the bookmark field corresponding to the element positioning configuration can be analyzed based on the same table to obtain the analyzed field. The equivalent substituted character table is shown in Table 9:
Table 9
[table image not reproduced in this text]
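Analyzing a field against a substitution table amounts to mapping each substitute token back to the original character. The tokens and characters below are invented for illustration, since the real pairs are in Table 9 and are not available in this text. The sketch assumes the tokens cannot be confused with ordinary field content.

```python
import re
from typing import Dict

# Hypothetical substitute tokens and the characters they stand for.
SUBSTITUTIONS: Dict[str, str] = {"SL": "/", "AT": "@", "LB": "[", "RB": "]"}


def decode_field(field: str, table: Dict[str, str] = SUBSTITUTIONS) -> str:
    """Replace every substitute token with its original character."""
    if not table:
        return field
    # Longest tokens first so a short token never shadows a longer one.
    tokens = sorted(table, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(t) for t in tokens))
    # A single pass avoids re-scanning text that was already substituted.
    return pattern.sub(lambda match: table[match.group(0)], field)


print(decode_field("SLSLinputLBATidRB"))
```

A single regular-expression pass is used rather than chained `str.replace` calls so that the output of one substitution is never mistaken for another token.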
S207, entering data into each service system based on the data processing result.
In the embodiment of the present invention, for the detailed description of step S207, refer to the detailed description of step S103, which is not repeated herein.
In the embodiment of the invention, the preset crawler action bookmark can be obtained, and the data processing corresponding to the crawler action bookmark can be realized by directly calling the preset crawler action bookmark by operating the crawler processing program, so that the data can be input into each service system. When the crawler action needs to be updated, the corresponding crawler action bookmark only needs to be modified, and extra processing is not needed to be carried out on the crawler processing program, so that the crawler action processing effect can be improved.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the invention. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required by the invention.
According to another aspect of the embodiment of the invention, a crawler action processing device for implementing the crawler action processing method is also provided. As shown in fig. 3, the apparatus includes:
a bookmark acquiring unit 301 configured to acquire a preset crawler action bookmark.
The bookmark calling unit 302 is configured to call the preset crawler action bookmark by using a crawler processing program, and perform data processing corresponding to the crawler action bookmark to obtain a data processing result.
A data entry unit 303 configured to enter data to each business system based on the data processing result.
As an optional implementation manner, the preset crawler action bookmark is set by the following steps: acquiring bookmark fields corresponding to all bookmark field types; and determining the preset crawler action bookmark based on the bookmark position corresponding to each bookmark field type and each obtained bookmark field.
As an optional implementation, the bookmark field category includes at least any combination of the following field categories: bookmark identification, element positioning mode, element positioning configuration, element type and cell information.
As an optional implementation, the bookmark invoking unit 302 is further configured to: identifying each bookmark field of the preset crawler action bookmark by using the crawler processing program; determining original data based on bookmark fields corresponding to the element positioning modes; and configuring the original data based on the bookmark fields corresponding to the element positioning configuration to obtain a data processing result.
As an optional implementation, the bookmark invoking unit 302 is further configured to: analyzing bookmark fields corresponding to the element positioning configuration based on a preset equivalent substituted character table to obtain analyzed fields; and configuring the original data based on the analyzed field to obtain a data processing result.
As an optional implementation, the bookmark invoking unit 302 is further configured to: and in the process of calling the preset crawler action bookmark by using the crawler processing program, in response to the detection that target characters exist in the preset crawler action bookmark, executing data processing corresponding to the target characters to obtain a data processing result.
In the embodiment of the invention, a preset crawler action bookmark is acquired, and the crawler processing program calls the bookmark directly to perform the corresponding data processing, so that data can be entered into each business system. When a crawler action needs to be updated, only the corresponding crawler action bookmark has to be modified and the crawler processing program itself needs no additional changes, which improves the effect of crawler action processing.
According to yet another aspect of the embodiments of the present invention, there is also provided an electronic device for implementing the above crawler action processing method, as shown in fig. 4, the electronic device includes a memory 402 and a processor 404, the memory 402 stores therein a computer program, and the processor 404 is configured to execute the steps in any one of the above method embodiments through the computer program.
Optionally, in this embodiment, the electronic device may be located in at least one network device of a plurality of network devices of a computer network.
Optionally, in this embodiment, the processor may be configured to execute the following steps by a computer program:
S1, acquiring a preset crawler action bookmark;
S2, calling the preset crawler action bookmark by using a crawler processing program, and performing data processing corresponding to the crawler action bookmark to obtain a data processing result;
S3, entering data into each business system based on the data processing result.
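As a non-limiting illustration, steps S1 to S3 can be sketched as a small driver function. The names `run_crawler_actions`, `load_bookmarks` and `process_bookmark` are hypothetical, and each business system is modeled as a plain dictionary only so that the example is self-contained.

```python
from typing import Callable, Dict, Iterable, List


def run_crawler_actions(
    load_bookmarks: Callable[[], Iterable[str]],
    process_bookmark: Callable[[str], str],
    business_systems: List[Dict[str, str]],
) -> Dict[str, str]:
    # S1: acquire the preset crawler action bookmarks.
    bookmarks = list(load_bookmarks())
    # S2: call each bookmark and perform its data processing.
    results = {bookmark: process_bookmark(bookmark) for bookmark in bookmarks}
    # S3: enter the data processing results into each business system.
    for system in business_systems:
        system.update(results)
    return results


# Usage with in-memory stand-ins for the crawled data and two business systems.
crawled = {"bm_invoice": "A001", "bm_total": "99.50"}
erp_system: Dict[str, str] = {}
tax_system: Dict[str, str] = {}
entered = run_crawler_actions(
    lambda: ["bm_invoice", "bm_total"],
    crawled.__getitem__,
    [erp_system, tax_system],
)
```

Because the bookmark source and the processing function are passed in as parameters, the driver does not change when a bookmark is edited, which mirrors the stated benefit of the method.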
Optionally, those skilled in the art can understand that the structure shown in fig. 4 is only an illustration, and the electronic device may also be a terminal device such as a smart phone (e.g., an Android phone or an iOS phone), a tablet computer, a palmtop computer, a Mobile Internet Device (MID), or a PAD. Fig. 4 does not limit the structure of the electronic device. For example, the electronic device may include more or fewer components (e.g., network interfaces) than shown in fig. 4, or have a configuration different from that shown in fig. 4.
The memory 402 may be used to store software programs and modules, such as the program instructions/modules corresponding to the crawler action processing method and apparatus in the embodiment of the present invention. The processor 404 executes various functional applications and data processing by running the software programs and modules stored in the memory 402, that is, it implements the crawler action processing method. The memory 402 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 402 may further include memory located remotely from the processor 404, which may be connected to the terminal over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof. The memory 402 may be used for storing information such as operation instructions, but is not limited thereto. As an example, as shown in fig. 4, the memory 402 may include, but is not limited to, the bookmark acquisition unit 301, the bookmark calling unit 302, and the data entry unit 303 of the crawler action processing apparatus. In addition, the memory 402 may further include, but is not limited to, other module units of the crawler action processing apparatus, which are not described in detail in this example.
Optionally, the transmission device 406 is used for receiving or sending data via a network. Examples of the network may include a wired network and a wireless network. In one example, the transmission device 406 includes a Network Interface Controller (NIC), which can be connected to a router and other network devices via a network cable so as to communicate with the internet or a local area network. In another example, the transmission device 406 is a Radio Frequency (RF) module, which is used to communicate with the internet wirelessly.
In addition, the electronic device further includes: a display 408 for displaying content; and a connection bus 410 for connecting the module components of the electronic device.
According to a further aspect of embodiments of the present invention, there is also provided a storage medium having a computer program stored therein, wherein the computer program is arranged to perform the steps of any of the above-mentioned method embodiments when executed.
Optionally, in the present embodiment, the storage medium may be configured to store a computer program for executing the following steps:
S1, acquiring a preset crawler action bookmark;
S2, calling the preset crawler action bookmark by using a crawler processing program, and performing data processing corresponding to the crawler action bookmark to obtain a data processing result;
S3, entering data into each business system based on the data processing result.
Optionally, in this embodiment, a person skilled in the art can understand that all or part of the steps in the methods of the foregoing embodiments may be implemented by a program instructing hardware associated with the terminal device, where the program may be stored in a computer-readable storage medium, and the storage medium may include: flash disks, Read-Only Memories (ROMs), Random Access Memories (RAMs), magnetic disks, optical disks, and the like.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
If the integrated unit in the above embodiments is implemented in the form of a software functional unit and sold or used as a separate product, it may be stored in the above computer-readable storage medium. Based on such understanding, the technical solution of the present invention in essence, or the part thereof that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes instructions for causing one or more computer devices (which may be personal computers, servers, or network devices) to execute all or part of the steps of the methods according to the embodiments of the present invention.
In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed client may be implemented in other manners. The above-described embodiments of the apparatus are merely illustrative, and for example, a division of a unit is merely a division of a logic function, and an actual implementation may have another division, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various modifications and improvements without departing from the principle of the present invention, and these modifications and improvements should also be considered to fall within the protection scope of the present invention.

Claims (10)

1. A crawler action processing method is characterized by comprising the following steps:
acquiring a preset crawler action bookmark;
calling the preset crawler action bookmark by utilizing a crawler processing program, and performing data processing corresponding to the crawler action bookmark to obtain a data processing result;
and entering data into each business system based on the data processing result.
2. The method of claim 1, wherein the preset crawler action bookmark is set by:
acquiring bookmark fields corresponding to all bookmark field types;
and determining the preset crawler action bookmark based on the bookmark position corresponding to each bookmark field type and each obtained bookmark field.
3. The method of claim 2, wherein the bookmark field categories include at least any combination of the following field categories: bookmark identification, element positioning mode, element positioning configuration, element type and cell information.
4. The method according to claim 3, wherein the using a crawler processing program to call the preset crawler action bookmark and perform data processing corresponding to the crawler action bookmark to obtain a data processing result comprises:
identifying each bookmark field of the preset crawler action bookmark by using the crawler processing program;
determining original data based on bookmark fields corresponding to the element positioning modes;
and configuring the original data based on the bookmark fields corresponding to the element positioning configuration to obtain a data processing result.
5. The method of claim 4, wherein configuring the raw data based on the bookmark field corresponding to the element positioning configuration, and obtaining a data processing result comprises:
analyzing bookmark fields corresponding to the element positioning configuration based on a preset equivalent substituted character table to obtain analyzed fields;
and configuring the original data based on the analyzed field to obtain the data processing result.
6. The method according to claim 1, wherein the using a crawler processing program to call the preset crawler action bookmark and perform data processing corresponding to the crawler action bookmark to obtain a data processing result includes:
and in the process of calling the preset crawler action bookmark by using the crawler processing program, in response to the detection that target characters exist in the preset crawler action bookmark, executing data processing corresponding to the target characters to obtain a data processing result.
7. A crawler action processing apparatus, comprising:
a bookmark acquisition unit configured to acquire a preset crawler action bookmark;
the bookmark calling unit is configured to call the preset crawler action bookmark by using a crawler processing program, and perform data processing corresponding to the crawler action bookmark to obtain a data processing result;
and the data entry unit is configured to enter data into each business system based on the data processing result.
8. The apparatus of claim 7, wherein the preset crawler action bookmark is set by:
acquiring bookmark fields corresponding to all bookmark field types;
and determining the preset crawler action bookmark based on the bookmark position corresponding to each bookmark field type and each obtained bookmark field.
9. The apparatus of claim 8, wherein the bookmark field categories include at least any combination of the following field categories: bookmark identification, element positioning mode, element positioning configuration, element type and cell information.
10. The apparatus of claim 9, wherein the bookmark invoking unit is further configured to:
identifying each bookmark field of the preset crawler action bookmark by using the crawler processing program;
determining original data based on bookmark fields corresponding to the element positioning modes;
and configuring the original data based on the bookmark fields corresponding to the element positioning configuration to obtain a data processing result.
CN202110381951.8A 2021-04-09 2021-04-09 Crawler action processing method and device Pending CN113076457A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110381951.8A CN113076457A (en) 2021-04-09 2021-04-09 Crawler action processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110381951.8A CN113076457A (en) 2021-04-09 2021-04-09 Crawler action processing method and device

Publications (1)

Publication Number Publication Date
CN113076457A true CN113076457A (en) 2021-07-06

Family

ID=76615930

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110381951.8A Pending CN113076457A (en) 2021-04-09 2021-04-09 Crawler action processing method and device

Country Status (1)

Country Link
CN (1) CN113076457A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140280010A1 (en) * 2013-03-15 2014-09-18 Western Digital Technologies, Inc. Shared media crawler database method and system
US20160227034A1 (en) * 2015-01-06 2016-08-04 Cyara Solutions Pty Ltd Interactive voice response system crawler
CN110188258A (en) * 2019-04-19 2019-08-30 平安科技(深圳)有限公司 The method and device of external data is obtained using crawler

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"https://www.cnblogs.com/csj2018/p/9194618.html", Retrieved from the Internet <URL:https://www.cnblogs.com/csj2018/p/9194618.html> *
韩前进等: "Web在线爬虫的设计与实现", 软件, vol. 39, no. 9, pages 86 - 92 *

Similar Documents

Publication Publication Date Title
CN105094707A (en) Method and device for storing and reading data
CN103763361A (en) Method and system for recommending applications based on user behavior and recommending server
CN108228551B (en) excel data import method, device, equipment and computer readable storage medium
CN105138340A (en) Interaction method and system for Native and Web pages
CN107133263B (en) POI recommendation method, device, equipment and computer readable storage medium
CN108399072A (en) Five application page update method and device
CN104424263A (en) Data recording method and data recording device
CN110515951A (en) A kind of BOM standardized method, system and electronic equipment and storage medium
CN111797594A (en) Character string processing method based on artificial intelligence and related equipment
CN111209374A (en) Data query display method and device, computer system and readable storage medium
CN110795697A (en) Logic expression obtaining method and device, storage medium and electronic device
CN108885544B (en) Front-end page internationalized processing method, application server and computer-readable storage medium
CN103593406A (en) Static resource identifier processing method and device
CN112650909A (en) Product display method and device, electronic equipment and storage medium
CN107463669B (en) Method and device for analyzing webpage data crawled by crawler
CN110245281B (en) Internet asset information collection method and terminal equipment
CN101727505B (en) Efficient data processing method and device
CN110647577A (en) Data cube partitioning method and device, computer equipment and storage medium
CN113407254A (en) Form generation method and device, electronic equipment and storage medium
CN111625567A (en) Data model matching method, device, computer system and readable storage medium
CN113076457A (en) Crawler action processing method and device
US7487227B2 (en) Scripting engine having a sequencer table and a plurality of secondary tables for network communication software
CN108984221B (en) Method and device for acquiring multi-platform user behavior logs
CN114089980A (en) Programming processing method, device, interpreter and nonvolatile storage medium
CN114625372A (en) Automatic component compiling method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination