CN114647768A - Notice information acquisition method, device, equipment, medium and product - Google Patents


Info

Publication number
CN114647768A
Authority
CN
China
Prior art keywords
target
information
website
field
target website
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210279782.1A
Other languages
Chinese (zh)
Inventor
胡雪惠
林震宇
徐立宇
林晨
陈艺辉
王金哲
陈佳雯
廖婉蓉
张晓丹
林晓东
陈建斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Construction Bank Corp
Original Assignee
China Construction Bank Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Construction Bank Corp
Priority to CN202210279782.1A
Publication of CN114647768A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G06F16/955 Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

Embodiments of the invention relate to the field of computer technology, and in particular to a method, apparatus, device, medium and product for collecting announcement information. The method comprises: receiving a query parameter list, wherein the query parameter list comprises identification information of a target website, login information of the target website, target keywords and a field to be collected; acquiring a target announcement corresponding to the target keyword according to the identification information of the target website, the login information of the target website and the target keyword; and capturing, from the target announcement, target announcement information corresponding to the field to be collected. With this technical solution, announcement information on a plurality of websites can be collected automatically, which reduces labor cost and improves collection efficiency.

Description

Notice information acquisition method, device, equipment, medium and product
Technical Field
The embodiment of the invention relates to the technical field of computers, in particular to a method, a device, equipment, a medium and a product for acquiring announcement information.
Background
In enterprise procurement and bidding management, external bidding information frequently needs to be summarized and sorted to provide data support for bidding decisions.
At present, domestic bidding announcements are spread across different websites, and announcement information is collected in one of two ways: manual data collection, or data collection by a web crawler.
Manual data collection requires staff to switch between websites frequently, visually search and compare the published bidding announcements, identify target announcements in the query results, copy the key information, and sort the announcement information into summary material that supports bidding decisions. However, because announcements are numerous and are published at different times, staff must log in to the websites repeatedly to run queries, which is time-consuming, labor-intensive and inefficient.
Data collection by a web crawler puts a certain amount of access traffic pressure on the website. In addition, as various anti-crawler mechanisms have emerged, capturing announcement information with a web crawler has become less stable.
Disclosure of Invention
Embodiments of the present invention provide a method, an apparatus, a device, a medium and a product for collecting announcement information. They address the time and labor consumption and low efficiency of manual announcement collection, as well as the access traffic pressure on websites and the instability that arise when announcement data is collected by a web crawler. Announcement information on a plurality of websites can be collected automatically, which reduces labor cost and improves collection efficiency.
According to an aspect of the present invention, there is provided an announcement information collecting method, including:
receiving a query parameter list, wherein the query parameter list comprises: identification information of a target website, login information of the target website, target keywords and a field to be collected;
acquiring a target announcement corresponding to the target keyword according to the identification information of the target website, the login information of the target website and the target keyword;
and capturing target announcement information corresponding to the field to be collected from the target announcement according to the field to be collected.
Further, acquiring the target announcement corresponding to the target keyword according to the identification information of the target website, the login information of the target website, and the target keyword includes:
acquiring a pre-stored website set and a pre-stored field set;
and if a website matching the identification information of the target website exists in the pre-stored website set and a field identical to the field to be collected exists in the pre-stored field set, acquiring the target announcement corresponding to the target keyword according to the identification information of the target website, the login information of the target website and the target keyword.
Further, acquiring a target announcement corresponding to the target keyword according to the identification information of the target website, the login information of the target website, and the target keyword includes:
logging in the target website according to the identification information of the target website and the login information of the target website;
and querying the target website according to the target keyword and the identification information of the target website to obtain the target announcement corresponding to the target keyword.
Further, the login information of the target website includes: the account number of the target website and the password of the target website.
Further, logging in the target website according to the identification information of the target website and the login information of the target website includes:
querying the pre-stored website set according to the identification information of the target website to obtain a position information set corresponding to the identification information of the target website, wherein the position information set comprises: account box position information, password box position information and login control position information;
inserting the account of the target website into the account box according to the account box position information;
inserting the password of the target website into the password box according to the password box position information;
and logging in to the target website according to the login control position information, the account of the target website and the password of the target website.
Further, the position information set further includes: query box position information.
Further, querying the target website according to the target keyword and the identification information of the target website to obtain the target announcement corresponding to the target keyword includes:
inserting the target keyword into the query box according to the query box position information to obtain the target announcement corresponding to the target keyword.
Further, capturing the target announcement information corresponding to the field to be collected from the target announcement according to the field to be collected includes:
identifying the target announcement to obtain first position information corresponding to the field to be collected;
determining position information to be captured according to the first position information corresponding to the field to be collected;
and capturing the target announcement information corresponding to the field to be collected according to the position information to be captured.
Further, capturing the target announcement information corresponding to the field to be collected from the target announcement according to the field to be collected includes:
capturing the target announcement information corresponding to the field to be collected from the target announcement according to the field to be collected and the identification information of the target website.
Further, capturing the target announcement information corresponding to the field to be collected from the target announcement according to the field to be collected and the identification information of the target website includes:
querying the pre-stored field set according to the identification information of the target website to obtain field position information corresponding to the identification information of the target website, wherein the field position information comprises: the position information to be captured corresponding to the field to be collected;
and capturing the target announcement information corresponding to the field to be collected according to the position information to be captured.
Further, the field to be collected includes: at least one of a bid number, a release time, a bid opening time, an entry deadline, bidding unit information, and a bid opening address.
According to another aspect of the present invention, there is provided an announcement information collecting apparatus including:
a receiving module, configured to receive a query parameter list, where the query parameter list includes: identification information of a target website, login information of the target website, target keywords and a field to be collected;
a target announcement acquisition module, configured to acquire a target announcement corresponding to the target keyword according to the identification information of the target website, the login information of the target website and the target keyword;
and an announcement information capturing module, configured to capture target announcement information corresponding to the field to be collected from the target announcement according to the field to be collected.
Further, the target announcement acquisition module is specifically configured to:
acquire a pre-stored website set and a pre-stored field set;
and if a website matching the identification information of the target website exists in the pre-stored website set and a field identical to the field to be collected exists in the pre-stored field set, acquire the target announcement corresponding to the target keyword according to the identification information of the target website, the login information of the target website and the target keyword.
Further, the target announcement acquisition module is specifically configured to:
log in to the target website according to the identification information of the target website and the login information of the target website;
and query the target website according to the target keyword and the identification information of the target website to obtain the target announcement corresponding to the target keyword.
Further, the login information of the target website includes: the account number of the target website and the password of the target website.
Further, the target announcement acquisition module is specifically configured to:
query the pre-stored website set according to the identification information of the target website to obtain a position information set corresponding to the identification information of the target website, wherein the position information set comprises: account box position information, password box position information and login control position information;
insert the account of the target website into the account box according to the account box position information;
insert the password of the target website into the password box according to the password box position information;
and log in to the target website according to the login control position information, the account of the target website and the password of the target website.
Further, the position information set further includes: query box position information.
Further, the target announcement acquisition module is specifically configured to:
insert the target keyword into the query box according to the query box position information to obtain the target announcement corresponding to the target keyword.
Further, the announcement information capturing module is specifically configured to:
identify the target announcement to obtain first position information corresponding to the field to be collected;
determine position information to be captured according to the first position information corresponding to the field to be collected;
and capture the target announcement information corresponding to the field to be collected according to the position information to be captured.
Further, the announcement information capturing module is specifically configured to:
capture the target announcement information corresponding to the field to be collected from the target announcement according to the field to be collected and the identification information of the target website.
Further, the announcement information capturing module is specifically configured to:
query the pre-stored field set according to the identification information of the target website to obtain field position information corresponding to the identification information of the target website, wherein the field position information comprises: the position information to be captured corresponding to the field to be collected;
and capture the target announcement information corresponding to the field to be collected according to the position information to be captured.
Further, the field to be collected includes: at least one of a bid number, a release time, a bid opening time, an entry deadline, bidding unit information, and a bid opening address.
According to another aspect of the present invention, there is provided an electronic apparatus including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores a computer program executable by the at least one processor, and the computer program is executed by the at least one processor to enable the at least one processor to execute the announcement information collecting method according to any embodiment of the present invention.
According to another aspect of the present invention, there is provided a computer-readable storage medium storing computer instructions which, when executed, cause a processor to implement the announcement information collecting method according to any one of the embodiments of the present invention.
According to another aspect of the present invention, there is provided a computer program product, which when executed by a processor implements the announcement information collecting method according to any one of the embodiments of the present invention.
In the embodiments of the invention, a query parameter list is received, the query parameter list comprising identification information of a target website, login information of the target website, target keywords and a field to be collected; a target announcement corresponding to the target keyword is acquired according to the identification information of the target website, the login information of the target website and the target keyword; and target announcement information corresponding to the field to be collected is captured from the target announcement according to the field to be collected. This addresses the time and labor consumption and low efficiency of manual announcement collection, as well as the access traffic pressure on websites and the instability that arise when announcement data is collected by a web crawler. Announcement information on a plurality of websites can be collected automatically, which reduces labor cost and improves collection efficiency.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present invention, nor do they necessarily limit the scope of the invention. Other features of the present invention will become apparent from the following description.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained according to the drawings without inventive efforts.
Fig. 1 is a flowchart of an announcement information collection method in an embodiment of the present invention;
Fig. 2 is a flowchart of another announcement information collection method in an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of an announcement information collecting device in an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of an electronic device in an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The acquisition, storage and/or processing of data involved in the technical solution of this application complies with the relevant national laws and regulations.
Embodiment One
Fig. 1 is a flowchart of an announcement information collecting method according to an embodiment of the present invention, where this embodiment is applicable to the situation of announcement information collection, and the method may be executed by an announcement information collecting device according to an embodiment of the present invention, where the announcement information collecting device may be implemented in a software and/or hardware manner, and as shown in fig. 1, the method specifically includes the following steps:
S110, receiving a query parameter list, wherein the query parameter list comprises: identification information of a target website, login information of the target website, target keywords and a field to be collected.
The announcement information collection method provided by the embodiment of the invention is executed by a robot agent, which is a device on which RPA (Robotic Process Automation) software is installed.
The target website is a bidding website, and the identification information of the target website may be the URL of the target website. For example, the query parameter list includes: the URL of bidding website A and the URL of bidding website B.
The login information of the target website comprises: the account number of the target website and the password of the target website.
The target keyword may be one word or multiple words, which is not limited in the embodiments of the present invention. Announcements are queried based on the target keyword.
The field to be collected may be one field or multiple fields, which is not limited in this embodiment of the present invention. Announcement information is captured from the announcement based on the field to be collected, and the field to be collected can be at least one of a bid number, a release time, a bid opening time, an entry deadline, bidding unit information and a bid opening address. The bidding unit information may include: the address of the bidding unit, the contact details of the bidding unit, and the like.
Specifically, receiving the query parameter list may be, for example: receiving a query parameter list that comprises: the URL of bidding website A, the URL of bidding website B, the URL of bidding website C, the account of bidding website A, the account of bidding website B, the account of bidding website C, the password of bidding website A, the password of bidding website B, the password of bidding website C, target keyword X and field Y to be collected.
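As an illustration of the example above, the query parameter list could be modeled as a small immutable structure. This is a minimal sketch only: the class names, attribute names and the `.example` URLs are assumptions made for illustration, and the source does not prescribe any particular data layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteLogin:
    """Identification and login information for one target website (hypothetical layout)."""
    url: str       # identification information of the target website
    account: str   # account of the target website
    password: str  # password of the target website


@dataclass(frozen=True)
class QueryParameterList:
    """The inputs received in S110, grouped per website (assumed grouping)."""
    sites: tuple[SiteLogin, ...]
    target_keywords: tuple[str, ...]
    fields_to_collect: tuple[str, ...]


# Placeholder values standing in for websites A and B, keyword X and field Y.
params = QueryParameterList(
    sites=(
        SiteLogin("https://bidding-a.example", "account_a", "password_a"),
        SiteLogin("https://bidding-b.example", "account_b", "password_b"),
    ),
    target_keywords=("X",),
    fields_to_collect=("Y",),
)
```

Grouping the URL, account and password per website keeps each set of credentials attached to the site it belongs to, which a flat list of values would not guarantee.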
S120, acquiring a target announcement corresponding to the target keyword according to the identification information of the target website, the login information of the target website and the target keyword.
The target announcement corresponding to the target keyword may be an announcement in which at least one of the title, the abstract and the body text contains the target keyword. It may also be an announcement in which at least one of the title, the abstract and the body text contains only some of the target keywords. For example, if the query parameter list includes target keyword X and target keyword M, and the title of an announcement contains target keyword X while its body text contains target keyword M, the announcement is determined to be a target announcement.
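The matching rule can be sketched as a small predicate. The sketch assumes that a single keyword found in any one of the three parts is enough to mark an announcement as a target announcement; the source leaves the exact rule open, and the function name and dictionary keys are hypothetical.

```python
def is_target_announcement(announcement: dict[str, str], keywords: list[str]) -> bool:
    """Return True if any keyword occurs in the title, abstract or body text."""
    # Missing parts are treated as empty text.
    parts = [announcement.get(name, "") for name in ("title", "abstract", "body")]
    return any(keyword in text for keyword in keywords for text in parts)


# Invented sample: keyword X appears in the title, keyword M in the body.
sample = {"title": "Notice about X", "abstract": "", "body": "details of M"}
matched = is_target_announcement(sample, ["X", "M"])
```

A stricter variant that requires every keyword to appear would replace the outer `any` with an `all` over the keywords.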
Specifically, the target announcement corresponding to the target keyword may be acquired according to the identification information of the target website, the login information of the target website and the target keyword as follows: acquiring a pre-stored website set and a pre-stored field set; and, if a website matching the identification information of the target website exists in the pre-stored website set and a field identical to the field to be collected exists in the pre-stored field set, acquiring the target announcement corresponding to the target keyword according to the identification information of the target website, the login information of the target website and the target keyword. Alternatively, the target announcement may be acquired by: logging in to the target website according to the identification information of the target website and the login information of the target website; and querying the target website according to the target keyword and the identification information of the target website to obtain the target announcement corresponding to the target keyword.
Alternatively, the target announcement may be acquired by: querying the pre-stored website set according to the identification information of the target website to obtain a position information set corresponding to the identification information of the target website, wherein the position information set comprises: account box position information, password box position information and login control position information; inserting the account of the target website into the account box according to the account box position information; inserting the password of the target website into the password box according to the password box position information; logging in to the target website according to the login control position information, the account of the target website and the password of the target website; and querying the target website according to the target keyword and the identification information of the target website to obtain the target announcement corresponding to the target keyword.
S130, capturing target announcement information corresponding to the field to be collected from the target announcement according to the field to be collected.
The target announcement information may be: the announcement information corresponding to the bid number; the announcement information corresponding to the release time and/or the bid opening time; or the announcement information corresponding to at least one of the entry deadline, the bidding unit information and the bid opening address, which is not limited in this embodiment of the present invention.
Specifically, the target announcement information corresponding to the field to be collected may be captured from the target announcement according to the field to be collected as follows: identifying the target announcement to obtain first position information corresponding to the field to be collected; determining position information to be captured according to the first position information corresponding to the field to be collected; and capturing the target announcement information corresponding to the field to be collected according to the position information to be captured. Alternatively, the target announcement information may be captured from the target announcement according to the field to be collected and the identification information of the target website. Alternatively, the target announcement information may be captured by: querying the pre-stored field set according to the identification information of the target website to obtain field position information corresponding to the identification information of the target website, wherein the field position information comprises: the position information to be captured corresponding to the field to be collected; and capturing the target announcement information corresponding to the field to be collected according to the position information to be captured.
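The first capture approach (locate the field, derive the position to capture, then capture) can be illustrated on plain text. This is a sketch under stated assumptions: character offsets in a string stand in for the on-screen position information that the RPA software would use, the value is assumed to follow its label on the same line, and the function name and sample text are invented.

```python
from typing import Optional


def capture_field(announcement_text: str, field_label: str) -> Optional[str]:
    """Capture the value that follows a field label on the same line."""
    # Step 1: identify the announcement to find the label (first position information).
    label_start = announcement_text.find(field_label)
    if label_start == -1:
        return None
    # Step 2: derive the span to capture, from the end of the label to the end of the line.
    value_start = label_start + len(field_label)
    line_end = announcement_text.find("\n", value_start)
    if line_end == -1:
        line_end = len(announcement_text)
    # Step 3: capture the span, dropping separators such as ":" and surrounding whitespace.
    return announcement_text[value_start:line_end].strip(" :：\t")


# Invented sample announcement text.
sample = "Bid number: ZB-0001\nBid opening time: 2022-04-01 09:30\n"
bid_number = capture_field(sample, "Bid number")
```

On a real page the label and its value may sit in separate UI elements, so the offset arithmetic here would be replaced by whatever position lookup the automation tool provides.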
Optionally, acquiring the target announcement corresponding to the target keyword according to the identification information of the target website, the login information of the target website and the target keyword includes:
acquiring a pre-stored website set and a pre-stored field set;
and if a website matching the identification information of the target website exists in the pre-stored website set and a field identical to the field to be collected exists in the pre-stored field set, acquiring the target announcement corresponding to the target keyword according to the identification information of the target website, the login information of the target website and the target keyword.
The pre-stored website set is a set built from the position information of elements on a plurality of websites, acquired in advance by the robot agent based on UI Automation. UI Automation is an accessibility framework that enables Windows applications to provide and consume programmatic information about user interfaces. It provides programmatic access to most UI elements on the desktop and enables assistive technology products (e.g., screen readers) to give end users information about the UI and to operate the UI by means other than standard input. UI Automation also allows automated test scripts to interact with the UI. The position information of the elements on a website includes: account box position information, password box position information, login control position information and query control position information.
In a specific example, the robot agent obtains position information of an element on a target website a and position information of an element on a target website B in advance based on UI Automation, associates and stores the position information of the element on the target website a and the identification information of the target website a to a website set, associates and stores the position information of the element on the target website B and the identification information of the target website B to the website set, and obtains a pre-stored website set.
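The association described in this example can be sketched as a mapping keyed by each website's identification information. The sketch is illustrative only: positions are stored as opaque strings because the source does not say how the RPA software represents them, and the class, function and URL names are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementPositions:
    """Position information of the elements on one website (assumed representation)."""
    account_box: str
    password_box: str
    login_control: str
    query_control: str


def store_site(website_set: dict[str, ElementPositions],
               site_url: str, positions: ElementPositions) -> None:
    """Associate a website's element positions with its identification information."""
    website_set[site_url] = positions


# Build the pre-stored website set for two placeholder websites, A and B.
website_set: dict[str, ElementPositions] = {}
store_site(website_set, "https://bidding-a.example",
           ElementPositions("a-account", "a-password", "a-login", "a-query"))
store_site(website_set, "https://bidding-b.example",
           ElementPositions("b-account", "b-password", "b-login", "b-query"))
```

Keying the set by URL makes the later lookup by identification information a single dictionary access.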
The pre-stored field set is a set built from position information, acquired in advance by the robot agent based on UI Automation, that corresponds to different fields in announcements on a plurality of websites. It comprises the correspondence between fields and position information. For example, the pre-stored field set may include: position information O corresponding to bidding field R, and position information P corresponding to bidding field T.
Specifically, if a website matching the identification information of the target website exists in the pre-stored website set, and a field matching the field to be acquired exists in the pre-stored field set, the target announcement corresponding to the target keyword is acquired according to the identification information of the target website, the login information of the target website, and the target keyword. For example, the pre-stored website set may include: the URL of target website A and the position information of the elements corresponding to target website A, the URL of target website B and the position information of the elements corresponding to target website B, and the URL of target website C and the position information of the elements corresponding to target website C. If the identification information of the target website is the URL of target website A, it is determined that a website matching the identification information of the target website exists in the pre-stored website set. Likewise, if the field to be acquired is bidding field R and the pre-stored field set contains bidding field R, it is determined that a field matching the field to be acquired exists in the pre-stored field set.
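The membership check described above can be illustrated with a minimal Python sketch. The data layout, the example URLs, the coordinate values, and the names `ElementPositions`, `PRESTORED_SITES`, `PRESTORED_FIELDS`, and `within_preset_range` are all hypothetical and are not taken from the patent; the sketch only shows the "both sets must contain a match" condition.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple

Position = Tuple[int, int]  # assumed (x, y) screen coordinates


@dataclass(frozen=True)
class ElementPositions:
    """Position information of the elements recorded for one website."""
    account_box: Position
    password_box: Position
    login_control: Position
    query_control: Position


# Hypothetical pre-stored website set, keyed by website identification (URL).
PRESTORED_SITES: Dict[str, ElementPositions] = {
    "https://site-a.example": ElementPositions((100, 200), (100, 240), (100, 280), (300, 120)),
    "https://site-b.example": ElementPositions((90, 180), (90, 220), (90, 260), (320, 110)),
}

# Hypothetical pre-stored field set (field names only, for the membership check).
PRESTORED_FIELDS: Set[str] = {"bid number", "release time"}


def within_preset_range(site_id: str, fields: List[str]) -> bool:
    """True only if the website and every requested field are pre-stored."""
    if not fields:
        return False
    return site_id in PRESTORED_SITES and all(f in PRESTORED_FIELDS for f in fields)
```

Under these assumptions, a request for a website or a field that was never recorded is rejected before any login is attempted.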
Optionally, acquiring the target announcement corresponding to the target keyword according to the identification information of the target website, the login information of the target website, and the target keyword includes:
logging in to the target website according to the identification information of the target website and the login information of the target website;
and querying the target website according to the target keyword and the identification information of the target website to obtain the target announcement corresponding to the target keyword.
Specifically, logging in to the target website according to the identification information of the target website and the login information of the target website may be performed as follows: querying the pre-stored website set according to the identification information of the target website to obtain a position information set corresponding to the identification information of the target website; inserting the account of the target website into the account box according to the account box position information; inserting the password of the target website into the password box according to the password box position information; and logging in to the target website according to the login control position information, the account of the target website, and the password of the target website.
Specifically, querying the target website according to the target keyword and the identification information of the target website to obtain the target announcement corresponding to the target keyword may be performed as follows: querying the pre-stored website set according to the identification information of the target website to obtain query box position information corresponding to the identification information of the target website, and inserting the target keyword into the query box according to the query box position information to obtain the target announcement corresponding to the target keyword.
Optionally, the login information of the target website includes: the account number of the target website and the password of the target website.
Optionally, logging in to the target website according to the identification information of the target website and the login information of the target website includes:
querying the pre-stored website set according to the identification information of the target website to obtain a position information set corresponding to the identification information of the target website, wherein the position information set comprises: account box position information, password box position information, and login control position information;
inserting the account of the target website into the account box according to the account box position information;
inserting the password of the target website into the password box according to the password box position information;
and logging in to the target website according to the login control position information, the account of the target website, and the password of the target website.
Specifically, inserting the account of the target website into the account box according to the account box position information may be performed as follows: the robot agent performs an input operation at the position indicated by the account box position information and fills the account of the target website into the account box.
Specifically, inserting the password of the target website into the password box according to the password box position information may be performed as follows: the robot agent performs an input operation at the position indicated by the password box position information and fills the password of the target website into the password box.
Specifically, logging in to the target website according to the login control position information, the account of the target website, and the password of the target website may be performed as follows: the robot agent performs a Click operation at the position indicated by the login control position information, simulating a manual login.
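The three login actions above (input, input, click) can be sketched as follows. This is an illustrative sketch only: the `UiDriver` interface and the `RecordingDriver` stand-in are hypothetical and do not correspond to any real UI Automation binding; a real robot agent would replace `RecordingDriver` with calls into its own automation layer.

```python
from dataclasses import dataclass
from typing import List, Protocol, Tuple

Position = Tuple[int, int]  # assumed (x, y) coordinates from the pre-stored website set


@dataclass(frozen=True)
class LoginPositions:
    account_box: Position
    password_box: Position
    login_control: Position


class UiDriver(Protocol):
    """Hypothetical minimal interface for input and click operations."""
    def input_text(self, position: Position, text: str) -> None: ...
    def click(self, position: Position) -> None: ...


class RecordingDriver:
    """Stand-in driver that records actions instead of touching a real UI."""
    def __init__(self) -> None:
        self.actions: List[tuple] = []

    def input_text(self, position: Position, text: str) -> None:
        self.actions.append(("input", position, text))

    def click(self, position: Position) -> None:
        self.actions.append(("click", position))


def login(driver: UiDriver, pos: LoginPositions, account: str, password: str) -> None:
    """Fill the account box, fill the password box, then click the login control."""
    driver.input_text(pos.account_box, account)
    driver.input_text(pos.password_box, password)
    driver.click(pos.login_control)
```

Keeping the driver behind a small interface means the login sequence can be checked without opening a browser.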
Optionally, the position information set further includes: query box position information.
Optionally, querying the target website according to the target keyword and the identification information of the target website to obtain the target announcement corresponding to the target keyword includes:
inserting the target keyword into the query box according to the query box position information to obtain the target announcement corresponding to the target keyword.
Specifically, inserting the target keyword into the query box according to the query box position information to obtain the target announcement corresponding to the target keyword may be performed as follows: the robot agent performs an input operation at the position indicated by the query box position information and fills the target keyword into the query box.
Optionally, capturing the target announcement information corresponding to the field to be acquired from the target announcement according to the field to be acquired includes:
identifying the target announcement to obtain first position information corresponding to the field to be acquired;
determining the position information to be captured according to the first position information corresponding to the field to be acquired;
and capturing the target announcement information corresponding to the field to be acquired according to the position information to be captured.
Specifically, the target announcement is identified to obtain the first position information corresponding to the field to be acquired; for example, the target announcement may be identified to obtain the first position information corresponding to the four characters of "bid number".
Specifically, the position information to be captured is determined according to the first position information corresponding to the field to be acquired; for example, the position information of the non-Chinese characters following the first position information corresponding to the four characters of "bid number" may be determined as the position information to be captured.
Specifically, the target announcement information corresponding to the field to be acquired is captured according to the position information to be captured; for example, the non-Chinese characters following the four characters of "bid number" may be obtained according to the position information of those non-Chinese characters.
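As a rough illustration of the three steps above, the following Python sketch works on plain announcement text rather than on-screen positions, which is a simplifying assumption. The Chinese labels 招标编号 and 发布时间 are assumed renderings of the four-character labels "bid number" and "release time"; the sample text and the function name `capture_after_label` are hypothetical.

```python
import re
from typing import Optional

# Optional separators (whitespace, ASCII or full-width colon), then a run of
# characters that are neither whitespace nor in the CJK Unified Ideographs block.
_VALUE = re.compile(r"[\s:：]*([^\u4e00-\u9fff\s]+)")


def capture_after_label(text: str, label: str) -> Optional[str]:
    """Return the non-Chinese characters that directly follow `label`, if any."""
    first_pos = text.find(label)            # step 1: locate the field label
    if first_pos < 0:
        return None
    capture_pos = first_pos + len(label)    # step 2: position right after the label
    match = _VALUE.match(text, capture_pos) # step 3: capture the value at that position
    return match.group(1) if match else None
```

For example, `capture_after_label("招标编号：ZB-2022-001 发布时间：2022-03-21", "招标编号")` yields `"ZB-2022-001"`. Because the sketch stops at the first whitespace, a value containing spaces (such as a date followed by a time) would need a looser pattern.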
Optionally, capturing the target announcement information corresponding to the field to be acquired from the target announcement according to the field to be acquired includes:
capturing the target announcement information corresponding to the field to be acquired from the target announcement according to the field to be acquired and the identification information of the target website.
Specifically, capturing the target announcement information corresponding to the field to be acquired from the target announcement according to the field to be acquired and the identification information of the target website may be performed as follows: querying the pre-stored field set according to the identification information of the target website to obtain field position information corresponding to the identification information of the target website, wherein the field position information comprises the position information to be captured corresponding to the field to be acquired; and capturing the target announcement information corresponding to the field to be acquired according to the position information to be captured.
Optionally, capturing the target announcement information corresponding to the field to be acquired from the target announcement according to the field to be acquired and the identification information of the target website includes:
querying the pre-stored field set according to the identification information of the target website to obtain field position information corresponding to the identification information of the target website, wherein the field position information comprises: the position information to be captured corresponding to the field to be acquired;
and capturing the target announcement information corresponding to the field to be acquired according to the position information to be captured.
Specifically, the position information to be captured corresponding to the field to be acquired is the position information of the announcement information that follows the field to be acquired. For example, if the field to be acquired is "bid number", the position information to be captured is the position information corresponding to the non-Chinese characters following the four characters of "bid number". If the field to be acquired is "release time", the position information to be captured is the position information corresponding to the non-Chinese characters following the four characters of "release time".
It should be noted that, for different bidding websites, the positions of the fields to be collected in the bulletins may have some differences, and therefore, the position information to be captured corresponding to the fields to be collected needs to be obtained by querying the set of prestored fields.
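Because the same field may sit at different positions on different bidding websites, the lookup can be pictured as a two-level mapping from website identification to field to position. The following is a minimal sketch under that assumption; the nested-dictionary layout, the site identifiers, the coordinates, and the name `positions_to_capture` are hypothetical.

```python
from typing import Dict, List, Tuple

Position = Tuple[int, int]  # assumed (x, y) coordinates within the announcement page

# Hypothetical pre-stored field set: website identification -> field -> position.
PRESTORED_FIELD_SET: Dict[str, Dict[str, Position]] = {
    "site-a": {"bid number": (120, 340), "release time": (120, 380)},
    "site-b": {"bid number": (80, 510), "release time": (80, 300)},
}


def positions_to_capture(site_id: str, fields: List[str]) -> Dict[str, Position]:
    """Look up, for one website, the position to capture for each requested field."""
    site_fields = PRESTORED_FIELD_SET.get(site_id)
    if site_fields is None:
        raise KeyError(f"website not in pre-stored field set: {site_id}")
    missing = [f for f in fields if f not in site_fields]
    if missing:
        raise KeyError(f"fields not pre-stored for {site_id}: {missing}")
    return {f: site_fields[f] for f in fields}
```

In this sketch the same field name resolves to different positions on "site-a" and "site-b", which is the situation the note above describes.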
Optionally, the field to be acquired includes: at least one of a bid number, a release time, a bid opening time, an entry deadline, bidding unit information, and a bid opening address.
The bidding unit information comprises: the address of the bidding unit and the contact information of the bidding unit.
In a specific example, as shown in fig. 2, the service personnel first send the robot agent a query parameter list for acquiring bidding announcement information. The query parameter list includes: the URL of the target website, the login account of the target website, the password of the target website, the target keyword, and the field to be acquired. After acquiring the parameter list, the robot agent judges whether the identification information of the target website and the field to be acquired exceed a preset range. For example, if a website matching the identification information of the target website exists in the pre-stored website set and a field matching the field to be acquired exists in the pre-stored field set, it is determined that the list information does not exceed the preset range. If the preset range is exceeded, the process fails. If it is not exceeded, the robot agent, having acquired the URL, account, and password of each target website, opens the login pages of the bidding websites one by one in a browser. On the login page, the elements to be acquired are the account box, the password box, and the login button; the robot agent can identify these elements and acquire their position information through UI Automation. For the account box and the password box, the robot agent performs an Input operation directly at their positions and fills in the corresponding values. For the login button element, a Click operation is performed to simulate a manual login. After login, UI Automation is used to operate the website menu to reach the announcement page, the target keyword is entered as the query, and the announcements whose titles match the target keyword are obtained.
The key fields of the target announcement are then located and captured according to the fields to be acquired specified in advance in the query parameter list, such as the bid number, release time, and bid opening time. After every bidding website in the list has been queried, all the acquired announcement information can be summarized and organized into Excel, stored and archived, and fed back to the service personnel.
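The final summarizing step can be sketched as below. The patent writes the summary to Excel; this sketch substitutes CSV text from the Python standard library as a stand-in, so that it stays self-contained. The column layout, the `site` column, and the function name `summarize` are hypothetical.

```python
import csv
import io
from typing import Dict, List


def summarize(rows: List[Dict[str, str]], fields: List[str]) -> str:
    """Collect per-website capture results into one table, returned as CSV text.

    Each row is expected to hold a "site" key plus any of the requested fields;
    fields that were not captured for a site are left blank.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["site"] + fields,
        restval="",            # blank cell when a field is missing from a row
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

A spreadsheet application can open the resulting text directly; writing a native workbook would require a third-party library, which is outside this sketch.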
According to the technical scheme of this embodiment, a query parameter list is received, the query parameter list including: identification information of a target website, login information of the target website, a target keyword, and a field to be acquired; a target announcement corresponding to the target keyword is acquired according to the identification information of the target website, the login information of the target website, and the target keyword; and target announcement information corresponding to the field to be acquired is captured from the target announcement according to the field to be acquired. This solves the problem that manual acquisition of announcement information is time-consuming, labor-intensive, and inefficient, as well as the problem of access traffic pressure and instability imposed on websites when announcement data is acquired by a web crawler. Announcement information on a plurality of websites can be acquired automatically, which reduces labor cost and improves acquisition efficiency.
Example two
Fig. 3 is a schematic structural diagram of an announcement information acquisition device according to an embodiment of the present invention. This embodiment is applicable to the acquisition of announcement information. The device can be implemented in software and/or hardware and can be integrated into any equipment that provides an announcement information acquisition function. As shown in fig. 3, the device specifically includes: a receiving module 210, a target announcement acquisition module 220, and an announcement information capturing module 230.
The receiving module is configured to receive a query parameter list, where the query parameter list includes: identification information of a target website, login information of the target website, a target keyword, and a field to be acquired;
the target announcement acquisition module is configured to acquire a target announcement corresponding to the target keyword according to the identification information of the target website, the login information of the target website, and the target keyword;
and the announcement information capturing module is configured to capture the target announcement information corresponding to the field to be acquired from the target announcement according to the field to be acquired.
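The three-module structure above can be pictured as a small composition of callables. This is only a structural sketch: the class name `AnnouncementAcquisitionDevice`, the dictionary-based parameter list, and the key name `fields` are hypothetical, and each module is injected so that any concrete implementation can be plugged in.

```python
from typing import Any, Callable, Dict, List

Params = Dict[str, Any]


class AnnouncementAcquisitionDevice:
    """Wires together the receiving, acquisition, and capturing modules."""

    def __init__(
        self,
        receive: Callable[[Any], Params],
        acquire_announcement: Callable[[Params], str],
        capture_fields: Callable[[str, List[str]], Dict[str, str]],
    ) -> None:
        self.receive = receive                            # receiving module
        self.acquire_announcement = acquire_announcement  # target announcement acquisition module
        self.capture_fields = capture_fields              # announcement information capturing module

    def run(self, raw_request: Any) -> Dict[str, str]:
        params = self.receive(raw_request)
        announcement = self.acquire_announcement(params)
        return self.capture_fields(announcement, params["fields"])
```

Each module only depends on the output of the previous one, which mirrors the order of the three method steps.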
Optionally, the target announcement acquisition module is specifically configured to:
acquire a pre-stored website set and a pre-stored field set;
and if a website matching the identification information of the target website exists in the pre-stored website set and a field matching the field to be acquired exists in the pre-stored field set, acquire the target announcement corresponding to the target keyword according to the identification information of the target website, the login information of the target website, and the target keyword.
Optionally, the target announcement acquisition module is specifically configured to:
log in to the target website according to the identification information of the target website and the login information of the target website;
and query the target website according to the target keyword and the identification information of the target website to obtain the target announcement corresponding to the target keyword.
Optionally, the login information of the target website includes: the account number of the target website and the password of the target website.
Optionally, the target announcement acquisition module is specifically configured to:
query the pre-stored website set according to the identification information of the target website to obtain a position information set corresponding to the identification information of the target website, wherein the position information set comprises: account box position information, password box position information, and login control position information;
insert the account of the target website into the account box according to the account box position information;
insert the password of the target website into the password box according to the password box position information;
and log in to the target website according to the login control position information, the account of the target website, and the password of the target website.
Optionally, the position information set further includes: query box position information.
Optionally, the target announcement acquisition module is specifically configured to:
insert the target keyword into the query box according to the query box position information to obtain the target announcement corresponding to the target keyword.
Optionally, the announcement information capturing module is specifically configured to:
identifying the target announcement to obtain first position information corresponding to the field to be acquired;
determining position information to be grabbed according to the first position information corresponding to the field to be acquired;
and capturing the target announcement information corresponding to the field to be acquired according to the position information to be captured.
Optionally, the announcement information capturing module is specifically configured to:
capture the target announcement information corresponding to the field to be acquired from the target announcement according to the field to be acquired and the identification information of the target website.
Optionally, the announcement information capturing module is specifically configured to:
query the pre-stored field set according to the identification information of the target website to obtain field position information corresponding to the identification information of the target website, wherein the field position information comprises: the position information to be captured corresponding to the field to be acquired;
and capture the target announcement information corresponding to the field to be acquired according to the position information to be captured.
Optionally, the field to be acquired includes: at least one of a bid number, a release time, a bid opening time, an entry deadline, bidding unit information, and a bid opening address.
The above device can execute the method provided by any embodiment of the present invention, and has the functional modules and beneficial effects corresponding to the executed method.
According to the technical scheme of this embodiment, a query parameter list is received, the query parameter list including: identification information of a target website, login information of the target website, a target keyword, and a field to be acquired; a target announcement corresponding to the target keyword is acquired according to the identification information of the target website, the login information of the target website, and the target keyword; and target announcement information corresponding to the field to be acquired is captured from the target announcement according to the field to be acquired. This solves the problem that manual acquisition of announcement information is time-consuming, labor-intensive, and inefficient, as well as the problem of access traffic pressure and instability imposed on websites when announcement data is acquired by a web crawler. Announcement information on a plurality of websites can be acquired automatically, which reduces labor cost and improves acquisition efficiency.
Example three
FIG. 4 shows a schematic block diagram of an electronic device 10 that may be used to implement an embodiment of the invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed herein.
As shown in fig. 4, the electronic device 10 includes at least one processor 11, and a memory communicatively connected to the at least one processor 11, such as a Read Only Memory (ROM)12, a Random Access Memory (RAM)13, and the like, wherein the memory stores a computer program executable by the at least one processor, and the processor 11 can perform various suitable actions and processes according to the computer program stored in the Read Only Memory (ROM)12 or the computer program loaded from a storage unit 18 into the Random Access Memory (RAM) 13. In the RAM 13, various programs and data necessary for the operation of the electronic apparatus 10 may also be stored. The processor 11, the ROM 12, and the RAM 13 are connected to each other via a bus 14. An input/output (I/O) interface 15 is also connected to bus 14.
A number of components in the electronic device 10 are connected to the I/O interface 15, including: an input unit 16 such as a keyboard, a mouse, or the like; an output unit 17 such as various types of displays, speakers, and the like; a storage unit 18 such as a magnetic disk, an optical disk, or the like; and a communication unit 19 such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The processor 11 may be any of a variety of general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the processor 11 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, or the like. The processor 11 performs the various methods and processes described above, such as the announcement information acquisition method:
receiving a query parameter list, wherein the query parameter list comprises: identification information of a target website, login information of the target website, a target keyword, and a field to be acquired;
acquiring a target announcement corresponding to the target keyword according to the identification information of the target website, the login information of the target website, and the target keyword;
and capturing target announcement information corresponding to the field to be acquired from the target announcement according to the field to be acquired.
In some embodiments, the announcement information acquisition method may be implemented as a computer program tangibly embodied in a computer-readable storage medium, such as the storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM 12 and/or the communication unit 19. When the computer program is loaded into the RAM 13 and executed by the processor 11, one or more steps of the above-described announcement information acquisition method may be performed. Alternatively, in other embodiments, the processor 11 may be configured by any other suitable means (e.g., by means of firmware) to perform the announcement information acquisition method:
receiving a query parameter list, wherein the query parameter list comprises: identification information of a target website, login information of the target website, a target keyword, and a field to be acquired;
acquiring a target announcement corresponding to the target keyword according to the identification information of the target website, the login information of the target website, and the target keyword;
and capturing target announcement information corresponding to the field to be acquired from the target announcement according to the field to be acquired.
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SoCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
A computer program for implementing the methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be performed. A computer program can execute entirely on a machine, partly on a machine, as a stand-alone software package partly on a machine and partly on a remote machine or entirely on a remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. A computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), blockchain networks, and the internet.
The computing system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or a cloud host, and is a host product in a cloud computing service system, so that the defects of high management difficulty and weak service expansibility in the traditional physical host and VPS service are overcome.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present invention may be executed in parallel, sequentially, or in different orders, and are not limited herein as long as the desired results of the technical solution of the present invention can be achieved.
An embodiment of the present invention further provides a computer program product including a computer program which, when executed by a processor, implements the announcement information acquisition method according to any embodiment of the present invention.
The above-described embodiments should not be construed as limiting the scope of the invention. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (15)

1. An announcement information acquisition method is characterized by comprising the following steps:
receiving a query parameter list, wherein the query parameter list comprises: identification information of a target website, login information of the target website, a target keyword, and a field to be acquired;
acquiring a target announcement corresponding to the target keyword according to the identification information of the target website, the login information of the target website, and the target keyword;
and capturing target announcement information corresponding to the field to be acquired from the target announcement according to the field to be acquired.
2. The method of claim 1, wherein acquiring the target announcement corresponding to the target keyword according to the identification information of the target website, the login information of the target website, and the target keyword comprises:
acquiring a pre-stored website set and a pre-stored field set;
and if the website which is the same as the identification information of the target website exists in the pre-stored website set and the field which is the same as the field to be acquired exists in the pre-stored field set, acquiring the target announcement corresponding to the target keyword according to the identification information of the target website, the login information of the target website and the target keyword.
3. The method of claim 2, wherein acquiring the target announcement corresponding to the target keyword according to the identification information of the target website, the login information of the target website, and the target keyword comprises:
logging in to the target website according to the identification information of the target website and the login information of the target website; and
querying the target website according to the target keyword and the identification information of the target website to obtain the target announcement corresponding to the target keyword.
4. The method of claim 3, wherein the login information of the target website comprises: the account number of the target website and the password of the target website.
5. The method of claim 4, wherein logging in to the target website according to the identification information of the target website and the login information of the target website comprises:
querying the pre-stored website set according to the identification information of the target website to obtain a position information set corresponding to the identification information of the target website, wherein the position information set comprises: account box position information, password box position information, and login control position information;
inserting the account of the target website into the account box according to the account box position information;
inserting the password of the target website into the password box according to the password box position information; and
logging in to the target website according to the login control position information, the account of the target website, and the password of the target website.
6. The method of claim 5, wherein the position information set further comprises: query box position information.
7. The method of claim 6, wherein querying the target website according to the target keyword and the identification information of the target website to obtain the target announcement corresponding to the target keyword comprises:
inserting the target keyword into the query box according to the query box position information to obtain the target announcement corresponding to the target keyword.
8. The method according to claim 1, wherein capturing the target announcement information corresponding to the field to be collected from the target announcement according to the field to be collected comprises:
identifying the target announcement to obtain first position information corresponding to the field to be collected;
determining position information to be captured according to the first position information corresponding to the field to be collected; and
capturing the target announcement information corresponding to the field to be collected according to the position information to be captured.
9. The method according to claim 2, wherein capturing the target announcement information corresponding to the field to be collected from the target announcement according to the field to be collected comprises:
capturing the target announcement information corresponding to the field to be collected from the target announcement according to the field to be collected and the identification information of the target website.
10. The method according to claim 9, wherein capturing the target announcement information corresponding to the field to be collected from the target announcement according to the field to be collected and the identification information of the target website comprises:
querying the pre-stored field set according to the identification information of the target website to obtain field position information corresponding to the identification information of the target website, wherein the field position information comprises: position information to be captured corresponding to the field to be collected; and
capturing the target announcement information corresponding to the field to be collected according to the position information to be captured.
11. The method according to any one of claims 1-10, wherein the field to be collected comprises at least one of: a bid number, a release time, an opening time, an entry deadline, bid unit information, and an opening address.
12. An announcement information acquisition device, characterized by comprising:
a receiving module, configured to receive a query parameter list, wherein the query parameter list comprises: identification information of a target website, login information of the target website, a target keyword, and a field to be collected;
a target announcement acquisition module, configured to acquire a target announcement corresponding to the target keyword according to the identification information of the target website, the login information of the target website, and the target keyword; and
an announcement information capturing module, configured to capture, from the target announcement, target announcement information corresponding to the field to be collected according to the field to be collected.
13. An electronic device, characterized in that the electronic device comprises:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor, and the computer program, when executed by the at least one processor, enables the at least one processor to perform the announcement information acquisition method of any one of claims 1-11.
14. A computer-readable storage medium storing computer instructions which, when executed, cause a processor to implement the announcement information acquisition method of any one of claims 1-11.
15. A computer program product, characterized in that the computer program product comprises a computer program which, when executed by a processor, implements the announcement information acquisition method of any one of claims 1-11.
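For illustration only, the flow recited in claims 1, 2 and 10 might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: the names `QueryParams`, `PRESTORED_WEBSITES`, `PRESTORED_FIELDS`, `FIELD_LABELS`, `is_supported`, `capture_fields`, `acquire` and `fake_fetch` are all hypothetical, the site identifier and sample announcement text are made up, and a text label preceding each value stands in for the "position information to be captured". The login and query steps of claims 3 to 7 are replaced by a stub, since they depend on the page layout of a real website.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical pre-stored website set and field set (claim 2).
PRESTORED_WEBSITES = {"site-a"}
PRESTORED_FIELDS = {"bid number", "release time"}

# Hypothetical field position information per website (claim 10).
# Assumption: a field is located by the text label that precedes its value.
FIELD_LABELS: Dict[str, Dict[str, str]] = {
    "site-a": {"bid number": "Bid No.:", "release time": "Released:"},
}


@dataclass
class QueryParams:
    """Query parameter list of claim 1 (attribute names are illustrative)."""
    website_id: str
    account: str
    password: str
    keyword: str
    fields: List[str]


def is_supported(params: QueryParams) -> bool:
    """Claim 2: the website and every requested field must be pre-stored."""
    return params.website_id in PRESTORED_WEBSITES and all(
        field in PRESTORED_FIELDS for field in params.fields
    )


def capture_fields(announcement: str, params: QueryParams) -> Dict[str, str]:
    """Claim 10: look up where each field sits, then capture its value."""
    labels = FIELD_LABELS[params.website_id]
    captured: Dict[str, str] = {}
    for field in params.fields:
        label = labels[field]
        for line in announcement.splitlines():
            text = line.strip()
            if text.startswith(label):
                captured[field] = text[len(label):].strip()
                break
    return captured


def acquire(params: QueryParams, fetch: Callable[[QueryParams], str]) -> Dict[str, str]:
    """Claim 1: validate, obtain the target announcement, capture the fields."""
    if not is_supported(params):
        raise ValueError("website or field is not in the pre-stored sets")
    return capture_fields(fetch(params), params)


def fake_fetch(params: QueryParams) -> str:
    """Stand-in for the login and keyword query steps; returns sample text."""
    return "Title: Server procurement\nBid No.: ZB-001\nReleased: 2022-03-21\n"
```

Passing the fetch step in as a callable keeps the validation and capture logic independent of any particular website; in practice that callable would perform the login and keyword query using the stored box and control positions.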
CN202210279782.1A 2022-03-21 2022-03-21 Notice information acquisition method, device, equipment, medium and product Pending CN114647768A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210279782.1A CN114647768A (en) 2022-03-21 2022-03-21 Notice information acquisition method, device, equipment, medium and product

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210279782.1A CN114647768A (en) 2022-03-21 2022-03-21 Notice information acquisition method, device, equipment, medium and product

Publications (1)

Publication Number Publication Date
CN114647768A true CN114647768A (en) 2022-06-21

Family

ID=81994585

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210279782.1A Pending CN114647768A (en) 2022-03-21 2022-03-21 Notice information acquisition method, device, equipment, medium and product

Country Status (1)

Country Link
CN (1) CN114647768A (en)

Similar Documents

Publication Publication Date Title
CN107390983B (en) Service instruction execution method, client and storage medium
CN112561332A (en) Model management method, model management apparatus, electronic device, storage medium, and program product
CN112269706A (en) Interface parameter checking method and device, electronic equipment and computer readable medium
CN113205320A (en) Service processing method and device, electronic equipment and computer readable medium
CN116611411A (en) Business system report generation method, device, equipment and storage medium
CN116545905A (en) Service health detection method and device, electronic equipment and storage medium
CN115048352B (en) Log field extraction method, device, equipment and storage medium
CN115860877A (en) Product marketing method, device, equipment and medium
CN114647768A (en) Notice information acquisition method, device, equipment, medium and product
CN115757304A (en) Log storage method, device and system, electronic equipment and storage medium
CN114706610A (en) Business flow chart generation method, device, equipment and storage medium
CN114693116A (en) Method and device for detecting code review validity and electronic equipment
CN111694686B (en) Processing method and device for abnormal service, electronic equipment and storage medium
CN113742501A (en) Information extraction method, device, equipment and medium
CN111176982A (en) Test interface generation method and device
CN111611476A (en) Method and device for displaying special topic page
CN113434432B (en) Performance test method, device, equipment and medium for recommendation platform
CN113095788A (en) Question distribution method and device, electronic equipment and storage medium
CN115391374A (en) Data matching method and device, electronic equipment and storage medium
CN115061664A (en) Object conversion method and device, electronic equipment and storage medium
CN115840604A (en) Data processing method and device, electronic equipment and computer readable storage medium
CN117785413A (en) Task forwarding method, device, equipment and storage medium
CN117251196A (en) Data maintenance method, device, equipment and storage medium
CN116434401A (en) Traffic management method, device, equipment and medium for user entering factory
CN115510357A (en) Method, device, equipment and medium for generating network page blind road

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination