CN113409111A - Bidding information processing method, system and readable storage medium - Google Patents

Bidding information processing method, system and readable storage medium

Info

Publication number
CN113409111A
Authority
CN
China
Prior art keywords
bidding
bid
name
information processing
character
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110660790.6A
Other languages
Chinese (zh)
Inventor
廉建林
罗杰华
陈家儒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Bidi Data Technology Co ltd
Original Assignee
Guangzhou Bidi Data Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2021-06-15
Filing date
2021-06-15
Publication date
2021-09-17
Application filed by Guangzhou Bidi Data Technology Co ltd
Priority to CN202110660790.6A
Publication of CN113409111A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0611 Request for offers or quotes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Abstract

The invention provides a bidding information processing method, a system and a readable storage medium. Bidding information is collected from the Internet and the link entry of each bidding project detail page is obtained; the detail page link of the bidding announcement is requested, the publication time and the specific details of the announcement are parsed, and the announcement is classified according to a preset announcement classification rule; the elements of the bidding information are then extracted, including the project number, project name, tenderer, agent and bidder. The method and system allow bidding information to be acquired automatically and efficiently.

Description

Bidding information processing method, system and readable storage medium
Technical Field
The present invention relates to the field of internet bidding, and more particularly, to a bidding information processing method, system and readable storage medium.
Background
In the course of business, enterprises frequently need to process bidding information, and therefore need to check the latest bidding announcements on various bidding websites in real time. This requires dedicated personnel for monitoring, and manually viewing the information is inefficient, time-consuming and labor-intensive. How to acquire bidding information automatically is therefore an urgent problem that remains unsolved.
Disclosure of Invention
In view of the above problems, it is an object of the present invention to provide a bid information processing method, system and readable storage medium, which can automatically and efficiently acquire bid information.
In order to solve the technical problems, the technical scheme of the invention is as follows:
A first aspect of the invention provides a bidding information processing method, comprising the following steps:
S1: collecting bidding information from the Internet and obtaining the link entry of each bidding project detail page;
S2: requesting the detail page link of the bidding announcement, obtaining the detail page of the bidding project, parsing the publication time and the specific details of the bidding announcement from the detail page, and classifying the bidding announcement according to a preset announcement classification rule, wherein the category of the bidding announcement comprises any one or more of the following: change announcement, tender announcement, winning bid information, tender preview, tender Q&A, tender documents, qualification results, laws and regulations, news information, proposed projects, exhibition promotion and owner procurement; and then storing the result in the corresponding category table in an Oracle database;
S3: extracting the elements of the bidding information, including the project number, project name, tenderer, agent and bidder.
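The preset classification rule of step S2 is not disclosed in detail. As a minimal sketch, assuming the rule is an ordered list of title keywords, the classification could look like the following; the keyword lists, the default category and the function name `classify_announcement` are all hypothetical.

```python
# Hypothetical keyword rules, checked in order; the patent does not disclose
# the actual rule set, so these keywords are illustrative only.
CATEGORY_RULES = [
    ("change announcement", ["change", "correction"]),
    ("winning bid information", ["winning", "award"]),
    ("tender preview", ["preview"]),
    ("tender announcement", ["tender", "invitation to bid"]),
]


def classify_announcement(title, default="news information"):
    """Return the first category whose keywords appear in the title."""
    lowered = title.lower()
    for category, keywords in CATEGORY_RULES:
        if any(keyword in lowered for keyword in keywords):
            return category
    return default


print(classify_announcement("Tender announcement for bridge repair"))
```

Rule order matters in this sketch: a title such as "Tender announcement change notice" is assigned to the change category because that rule is checked first.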
In this scheme, in step S1, bidding information is collected by a data collection system developed in Java and built on the open-source WebMagic framework.
In this scheme, in step S1, while collecting bidding information, the summary entry of the projects is located, and the link entry of each bidding announcement detail page is obtained according to a specific jsoup or XPath parsing rule, or a regular expression.
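The regular-expression variant of this link extraction can be sketched as follows. The patent's collector is a Java system; this illustration uses Python's standard library only, and the URL pattern `/detail/<id>.html`, the base URL and the function name are assumptions made for the example.

```python
import re
from urllib.parse import urljoin

# Hypothetical pattern: detail pages are assumed to live under /detail/<id>.html.
DETAIL_LINK = re.compile(r'href="([^"]*/detail/\d+\.html)"')


def extract_detail_links(list_html, base_url):
    """Return absolute detail-page URLs found in a list page, in order, without duplicates."""
    seen, links = set(), []
    for href in DETAIL_LINK.findall(list_html):
        url = urljoin(base_url, href)  # resolve relative links against the list page
        if url not in seen:
            seen.add(url)
            links.append(url)
    return links


sample = (
    '<ul><li><a href="/detail/101.html">Project A</a></li>'
    '<li><a href="/detail/102.html">Project B</a></li>'
    '<li><a href="/about.html">About</a></li>'
    '<li><a href="/detail/101.html">Project A (repeat)</a></li></ul>'
)
print(extract_detail_links(sample, "https://example.com/list"))
```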
In this scheme, in step S3, extracting the elements of the bidding information specifically includes:
S31: processing the HTML data of the bidding project detail page and converting it into text data with the third-party Python library bs4;
S32: splitting the text data into sentences;
S33: obtaining the classification of each character in the sentence-split text data;
S34: obtaining the project number and the project name by matching well-formed project number and project name label sequences;
S35: obtaining the tenderer, the agent and the bidder.
In this scheme, steps S33-S34 specifically include:
obtaining the classification of each character through a deep learning model of word vector + bidirectional LSTM + CRF, specifically nine classes: B_code, M_code, E_code, S_code, B_name, M_name, E_name, S_name and O, which respectively denote the first character of a number, a middle character of a number, the last character of a number, a single-character number, the first character of a name, a middle character of a name, the last character of a name, a single-character name, and an ordinary character;
and finally obtaining the project number and the project name by matching well-formed project number and project name label sequences, such as B_code + M_code × n + E_code and B_name + M_name × n + E_name.
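The label-sequence matching can be illustrated with a short sketch. It assumes the per-character labels have already been predicted by the model; the function name `extract_spans` and the sample text are hypothetical.

```python
def extract_spans(chars, tags, kind):
    """Collect substrings whose labels form B_kind, M_kind * n, E_kind or a single S_kind."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == f"S_{kind}":
            spans.append(chars[i])
            start = None
        elif tag == f"B_{kind}":
            start = i  # remember where the span begins
        elif tag == f"M_{kind}" and start is not None:
            continue  # still inside a span
        elif tag == f"E_{kind}" and start is not None:
            spans.append("".join(chars[start:i + 1]))
            start = None
        else:
            start = None  # any other label ends or invalidates a partial span
    return spans


chars = list("ID:A-7 road")
tags = ["O", "O", "O", "B_code", "M_code", "E_code", "O",
        "B_name", "M_name", "M_name", "E_name"]
print(extract_spans(chars, tags, "code"), extract_spans(chars, tags, "name"))
```

Sequences that do not follow the expected pattern, for example an E_code with no preceding B_code, are dropped rather than repaired in this sketch.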
In this scheme, in step S35, obtaining the tenderer, the agent and the bidder specifically includes:
first, enterprise names are identified with the third-party Python library foolnltk; then, for each company, the corresponding context is taken at every position where its name appears, and the company's role is classified from that context by a deep learning model (including but not limited to word vector + bidirectional LSTM + softmax), the classification results being tenderer, agent, first winning bid candidate, second winning bid candidate, third winning bid candidate, and none.
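The context-taking step can be sketched without the model itself. The example below assumes a fixed character window on each side of every occurrence; the window width, the function name `context_windows` and the sample text are hypothetical, and the role classifier that would consume these windows is not shown.

```python
def context_windows(text, name, width=10):
    """Return (position, context) for every occurrence of name,
    keeping up to `width` characters on each side."""
    windows = []
    pos = text.find(name)
    while pos != -1:
        left = max(0, pos - width)
        right = min(len(text), pos + len(name) + width)
        windows.append((pos, text[left:right]))
        pos = text.find(name, pos + 1)  # continue searching after this hit
    return windows


text = "Tenderer: Acme Co. Agent: Beta Ltd. First candidate: Acme Co."
for position, context in context_windows(text, "Acme Co."):
    print(position, repr(context))
```

Each window would then be fed to the classifier, so the same company can receive different roles at different positions in the announcement.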
In this scheme, the method further includes:
providing a bidding project search interface, receiving search conditions entered by a user, and pushing bidding information according to the search conditions. For example, projects can be searched comprehensively by conditions such as project keywords, region, announcement publication time and information category, and advanced search modes such as exact, fuzzy and intelligent search can be used.
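A condition search of this kind can be sketched as a simple in-memory filter. The record fields, the fuzzy threshold and the use of `difflib.SequenceMatcher` for fuzzy matching are assumptions for illustration; the patent does not say how the search modes are implemented.

```python
from dataclasses import dataclass
from datetime import date
from difflib import SequenceMatcher


@dataclass
class Announcement:
    title: str
    region: str
    category: str
    published: date


def matches(a, keyword="", region=None, category=None, since=None,
            fuzzy=False, threshold=0.6):
    """Check one announcement against the user's search conditions."""
    if region and a.region != region:
        return False
    if category and a.category != category:
        return False
    if since and a.published < since:
        return False
    if not keyword:
        return True
    if fuzzy:
        # Fuzzy mode: compare the keyword with each word of the title.
        best = max((SequenceMatcher(None, keyword.lower(), w.lower()).ratio()
                    for w in a.title.split()), default=0.0)
        return best >= threshold
    return keyword.lower() in a.title.lower()  # exact (substring) mode


def search(items, **conditions):
    return [a for a in items if matches(a, **conditions)]


items = [
    Announcement("Bridge repair tender", "Guangzhou", "tender announcement", date(2021, 6, 1)),
    Announcement("School canteen procurement", "Shenzhen", "tender announcement", date(2021, 5, 1)),
]
print([a.title for a in search(items, keyword="brigde", fuzzy=True)])
```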
In this scheme, the method further includes:
automatically pushing bidding information according to the search conditions, push time and receiving mode entered by the user.
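One way to picture the automatic push is a subscription record that stores the three user inputs. The sketch below assumes a daily push time and a single keyword as the search condition; the `Subscription` fields, the channel names and the function `due_pushes` are hypothetical.

```python
from dataclasses import dataclass
from datetime import time


@dataclass
class Subscription:
    user: str
    keyword: str      # search condition (reduced to one keyword here)
    push_time: time   # daily push time chosen by the user
    channel: str      # receiving mode, e.g. "email" or "sms" (hypothetical values)


def due_pushes(subscriptions, titles, now):
    """Return (user, channel, matching titles) for subscriptions due at `now`."""
    pushes = []
    for s in subscriptions:
        if (s.push_time.hour, s.push_time.minute) != (now.hour, now.minute):
            continue  # not this subscription's push time
        hits = [t for t in titles if s.keyword.lower() in t.lower()]
        if hits:
            pushes.append((s.user, s.channel, hits))
    return pushes


subs = [
    Subscription("alice", "bridge", time(9, 0), "email"),
    Subscription("bob", "road", time(18, 0), "sms"),
]
titles = ["Bridge repair tender", "Road paving tender"]
print(due_pushes(subs, titles, time(9, 0)))
```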
A second aspect of the present invention provides a bidding information processing system, including:
an entry acquisition module, used for collecting bidding information from the Internet and obtaining the link entry of each bidding project detail page;
a bidding announcement classification module, used for requesting the detail page link of the bidding announcement, obtaining the detail page of the bidding project, parsing the publication time and the specific details of the bidding announcement from the detail page, and classifying the bidding announcement according to a preset announcement classification rule, wherein the category of the bidding announcement comprises any one or more of the following: change announcement, tender announcement, winning bid information, tender preview, tender Q&A, tender documents, qualification results, laws and regulations, news information, proposed projects, exhibition promotion and owner procurement; and then storing the result in the corresponding category table in an Oracle database;
a bidding information extraction module, used for extracting the elements of the bidding information, including the project number, project name, tenderer, agent and bidder.
A third aspect of the present invention provides a computer-readable storage medium having embodied therein a bid information processing method program, which when executed by a processor implements the bid information processing method.
Compared with the prior art, the technical scheme of the invention has the following beneficial effects: the invention provides a bidding information processing method, a system and a readable storage medium, in which bidding information is collected from the Internet and the link entry of each bidding project detail page is obtained; the detail page link of the bidding announcement is requested, the publication time and the specific details of the announcement are parsed, and the announcement is classified according to a preset announcement classification rule; and the elements of the bidding information are extracted, including the project number, project name, tenderer, agent and bidder. The method and system allow bidding information to be acquired automatically and efficiently.
Drawings
Fig. 1 is a flowchart of a bidding information processing method according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of a bidding information processing system according to an embodiment of the present invention.
Detailed Description
In order that the above objects, features and advantages of the present invention can be more clearly understood, a more particular description of the invention will be rendered by reference to the appended drawings. It should be noted that the embodiments and features of the embodiments of the present application may be combined with each other without conflict.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, however, the present invention may be practiced in other ways than those specifically described herein, and therefore the scope of the present invention is not limited by the specific embodiments disclosed below.
As shown in Fig. 1, the present invention discloses a bidding information processing method, comprising the following steps:
S1: collecting bidding information from the Internet and obtaining the link entry of each bidding project detail page;
S2: requesting the detail page link of the bidding announcement, obtaining the detail page of the bidding project, parsing the publication time and the specific details of the bidding announcement from the detail page, and classifying the bidding announcement according to a preset announcement classification rule, wherein the category of the bidding announcement comprises any one or more of the following: change announcement, tender announcement, winning bid information, tender preview, tender Q&A, tender documents, qualification results, laws and regulations, news information, proposed projects, exhibition promotion and owner procurement; and then storing the result in the corresponding category table in an Oracle database;
S3: extracting the elements of the bidding information, including the project number, project name, tenderer, agent and bidder.
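The storage part of step S2, one table per announcement category, can be sketched as below. The embodiment names an Oracle database; here Python's built-in `sqlite3` stands in for it so the example runs without a server, and the table names, columns and the `store` function are hypothetical.

```python
import sqlite3

# Hypothetical mapping from announcement category to table name.
# sqlite3 is a stand-in for the Oracle database named in the embodiment.
TABLES = {
    "tender announcement": "tender_announcement",
    "winning bid information": "winning_bid",
    "change announcement": "change_announcement",
}


def store(conn, category, title, published, detail):
    """Insert a parsed announcement into the table of its category."""
    table = TABLES[category]  # only whitelisted table names reach the SQL text
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {table} (title TEXT, published TEXT, detail TEXT)"
    )
    conn.execute(f"INSERT INTO {table} VALUES (?, ?, ?)", (title, published, detail))
    conn.commit()


conn = sqlite3.connect(":memory:")
store(conn, "tender announcement", "Bridge repair", "2021-06-15", "Full text of the notice")
print(conn.execute("SELECT title, published FROM tender_announcement").fetchall())
```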
In step S1, bidding information is collected by a data collection system developed in Java and built on the open-source WebMagic framework.
It should be noted that WebMagic is an open-source Java vertical crawler framework that aims to simplify crawler development so that developers can concentrate on the business logic. WebMagic adopts a fully modular design that covers the whole crawler life cycle (link extraction, page downloading, content extraction and persistence), supports multi-threaded and distributed crawling, and supports features such as automatic retry and custom User-Agent/cookies. WebMagic includes a page extraction facility: developers can use CSS selectors, XPath and regular expressions to extract links and content, and can chain multiple selectors.
According to the embodiment of the invention, in step S1, while collecting bidding information, the summary entry of the projects is located, and the link entry of each bidding announcement detail page is obtained according to a specific jsoup or XPath parsing rule or a regular expression.
It should be noted that jsoup is a Java HTML parser that can directly parse a URL or HTML text. It provides a convenient API for fetching and manipulating data through DOM traversal, CSS selectors and jQuery-like methods.
XPath, in full the XML Path Language, is a language for locating information in XML documents. It was originally designed for searching XML documents but applies equally to HTML documents, so XPath can be used to extract the relevant information when building a crawler.
A regular expression describes a pattern for matching character strings; it can be used to check whether a string contains a certain substring, to replace matching substrings, or to extract substrings that meet a given condition.
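As an illustration of extracting a substring with a regular expression, the sketch below parses an announcement's publication time from page text. The accepted date formats and the function name `parse_publish_time` are assumptions; real bidding sites may use other layouts.

```python
import re
from datetime import datetime

# Hypothetical formats: 2021-06-15, 2021/06/15 or 2021年6月15日,
# optionally followed by a time such as 09:30.
PUBLISHED = re.compile(
    r"(\d{4})[-/年](\d{1,2})[-/月](\d{1,2})日?(?:\s+(\d{1,2}):(\d{2}))?"
)


def parse_publish_time(text):
    """Return the first date (and time, if present) found in text, or None."""
    m = PUBLISHED.search(text)
    if m is None:
        return None
    year, month, day = (int(g) for g in m.group(1, 2, 3))
    hour, minute = (int(g) if g else 0 for g in m.group(4, 5))
    try:
        return datetime(year, month, day, hour, minute)
    except ValueError:
        return None  # digits matched but do not form a valid date


print(parse_publish_time("Published: 2021-06-15 09:30"))
```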
According to the embodiment of the present invention, in step S3, extracting the elements of the bidding information specifically includes:
S31: processing the HTML data of the bidding project detail page and converting it into text data with the third-party Python library bs4;
S32: splitting the text data into sentences;
S33: obtaining the classification of each character in the sentence-split text data;
S34: obtaining the project number and the project name by matching well-formed project number and project name label sequences;
S35: obtaining the tenderer, the agent and the bidder.
It should be noted that bs4 refers to Beautiful Soup, which provides simple, Pythonic functions for navigating, searching and modifying the parse tree. It automatically converts input documents to Unicode and output documents to UTF-8.
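Steps S31 and S32 can be sketched as follows. To keep the example dependency-free, the standard-library `html.parser` is used here as a stand-in for bs4, which the embodiment names; the sentence delimiters and the function names are assumptions.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> content."""

    def __init__(self):
        super().__init__()
        self.parts, self._skip = [], 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """S31: convert HTML to plain text, one text node per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


def split_sentences(text):
    """S32: split after Chinese or Western sentence-final punctuation and on line breaks.
    The full stop '.' is deliberately not a delimiter, since project numbers may contain it."""
    pieces = re.split(r"(?<=[。！？!?;；])\s*|\n+", text)
    return [p.strip() for p in pieces if p and p.strip()]


page = ("<html><head><style>p{}</style></head><body><p>Project No: A-7.</p>"
        "<script>var x=1;</script><p>Tenderer: Acme! Agent: Beta?</p></body></html>")
print(split_sentences(html_to_text(page)))
```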
According to the embodiment of the present invention, steps S33-S34 specifically include:
obtaining the classification of each character through a deep learning model of word vector + bidirectional LSTM + CRF, specifically nine classes: B_code, M_code, E_code, S_code, B_name, M_name, E_name, S_name and O, which respectively denote the first character of a number, a middle character of a number, the last character of a number, a single-character number, the first character of a name, a middle character of a name, the last character of a name, a single-character name, and an ordinary character;
and finally obtaining the project number and the project name by matching well-formed project number and project name label sequences, such as B_code + M_code × n + E_code and B_name + M_name × n + E_name.
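The patent does not describe how the CRF layer turns per-character scores into labels. A common approach, shown here only as an illustrative sketch, is Viterbi decoding over emission scores (from the bidirectional LSTM) and transition scores (learned by the CRF). The three-label subset and all score values below are made up for the example.

```python
def viterbi(emissions, transitions, tags):
    """Return the highest-scoring label sequence.

    emissions[t][j]: score of label j at position t (e.g. from the BiLSTM).
    transitions[i][j]: score of moving from label i to label j (CRF parameters).
    """
    n = len(tags)
    score = list(emissions[0])
    backptr = []
    for emit in emissions[1:]:
        prev, score, ptr = score, [], []
        for j in range(n):
            # Best previous label for reaching label j at this position.
            best_i = max(range(n), key=lambda i: prev[i] + transitions[i][j])
            ptr.append(best_i)
            score.append(prev[best_i] + transitions[best_i][j] + emit[j])
        backptr.append(ptr)
    # Trace back from the best final label.
    path = [max(range(n), key=lambda j: score[j])]
    for ptr in reversed(backptr):
        path.append(ptr[path[-1]])
    return [tags[j] for j in reversed(path)]


tags = ["O", "B_code", "E_code"]
# Made-up transition scores: -100 marks transitions that break the B...E pattern.
transitions = [
    [0, 0, -100],     # from O: O -> E_code is not allowed
    [-100, -100, 0],  # from B_code: must continue to E_code
    [0, 0, -100],     # from E_code: E_code -> E_code is not allowed
]
emissions = [[2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 0.0, 0.9]]
print(viterbi(emissions, transitions, tags))
```

Picking the best label per position independently would give O, B_code, O for these scores, which is not a well-formed sequence; the transition scores steer decoding to O, B_code, E_code instead.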
According to the embodiment of the present invention, in step S35, obtaining the tenderer, the agent and the bidder specifically includes:
first, enterprise names are identified with the third-party Python library foolnltk; then, for each company, the corresponding context is taken at every position where its name appears, and the company's role is classified from that context by a deep learning model (including but not limited to word vector + bidirectional LSTM + softmax), the classification results being tenderer, agent, first winning bid candidate, second winning bid candidate, third winning bid candidate, and none.
According to an embodiment of the invention, the method further comprises:
providing a bidding project search interface, receiving search conditions entered by a user, and pushing bidding information according to the search conditions. For example, projects can be searched comprehensively by conditions such as project keywords, region, announcement publication time and information category, and advanced search modes such as exact, fuzzy and intelligent search can be used.
According to an embodiment of the invention, the method further comprises:
automatically pushing bidding information according to the search conditions, push time and receiving mode entered by the user.
As shown in Fig. 2, the present invention discloses a bidding information processing system, comprising:
an entry acquisition module, used for collecting bidding information from the Internet and obtaining the link entry of each bidding project detail page;
a bidding announcement classification module, used for requesting the detail page link of the bidding announcement, obtaining the detail page of the bidding project, parsing the publication time and the specific details of the bidding announcement from the detail page, and classifying the bidding announcement according to a preset announcement classification rule, wherein the category of the bidding announcement comprises any one or more of the following: change announcement, tender announcement, winning bid information, tender preview, tender Q&A, tender documents, qualification results, laws and regulations, news information, proposed projects, exhibition promotion and owner procurement; and then storing the result in the corresponding category table in an Oracle database;
a bidding information extraction module, used for extracting the elements of the bidding information, including the project number, project name, tenderer, agent and bidder.
A third aspect of the present invention provides a computer-readable storage medium having embodied therein a bid information processing method program, which when executed by a processor implements the bid information processing method.
The invention discloses a bidding information processing method, a system and a readable storage medium, in which bidding information is collected from the Internet and the link entry of each bidding project detail page is obtained; the detail page link of the bidding announcement is requested, the publication time and the specific details of the announcement are parsed, and the announcement is classified according to a preset announcement classification rule; and the elements of the bidding information are extracted, including the project number, project name, tenderer, agent and bidder. The method and system allow bidding information to be acquired automatically and efficiently.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described device embodiments are merely illustrative, for example, the division of the unit is only a logical functional division, and there may be other division ways in actual implementation, such as: multiple units or components may be combined, or may be integrated into another system, or some features may be omitted, or not implemented. In addition, the coupling, direct coupling or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection between the devices or units may be electrical, mechanical or other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units; can be located in one place or distributed on a plurality of network units; some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, all the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may be separately regarded as one unit, or two or more units may be integrated into one unit; the integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
Those of ordinary skill in the art will understand that: all or part of the steps for realizing the method embodiments can be completed by hardware related to program instructions, the program can be stored in a computer readable storage medium, and the program executes the steps comprising the method embodiments when executed; and the aforementioned storage medium includes: a mobile storage device, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
Alternatively, the integrated unit of the present invention may be stored in a computer-readable storage medium if it is implemented in the form of a software functional module and sold or used as a separate product. Based on such understanding, the technical solutions of the embodiments of the present invention may be essentially implemented or a part contributing to the prior art may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the methods described in the embodiments of the present invention. And the aforementioned storage medium includes: a removable storage device, a ROM, a RAM, a magnetic or optical disk, or various other media that can store program code.

Claims (10)

1. A bidding information processing method, comprising the following steps:
S1: collecting bidding information from the Internet and obtaining the link entry of each bidding project detail page;
S2: requesting the detail page link of the bidding announcement, obtaining the detail page of the bidding project, parsing the publication time and the specific details of the bidding announcement from the detail page, classifying the bidding announcement according to a preset announcement classification rule, and storing the classified announcement in the corresponding category table in a database;
S3: extracting the elements of the bidding information, including the project number, project name, tenderer, agent and bidder.
2. The bidding information processing method according to claim 1, wherein in step S1 the bidding information is collected by a data collection system developed in Java and built on the open-source WebMagic framework.
3. The bidding information processing method according to claim 2, wherein in step S1, while collecting bidding information, the summary entry of the projects is located, and the link entry of each bidding announcement detail page is obtained according to a specific jsoup or XPath parsing rule or a regular expression.
4. The bidding information processing method according to claim 1, wherein extracting the elements of the bidding information in step S3 specifically includes:
S31: processing the HTML data of the bidding project detail page and converting it into text data with the third-party Python library bs4;
S32: splitting the text data into sentences;
S33: obtaining the classification of each character in the sentence-split text data;
S34: obtaining the project number and the project name by matching well-formed project number and project name label sequences;
S35: obtaining the tenderer, the agent and the bidder.
5. The bidding information processing method according to claim 4, wherein steps S33-S34 specifically include:
obtaining the classification of each character through a deep learning model of word vector + bidirectional LSTM + CRF, specifically nine classes: B_code, M_code, E_code, S_code, B_name, M_name, E_name, S_name and O, which respectively denote the first character of a number, a middle character of a number, the last character of a number, a single-character number, the first character of a name, a middle character of a name, the last character of a name, a single-character name, and an ordinary character;
and finally obtaining the project number and the project name by matching well-formed project number and project name label sequences, such as B_code + M_code × n + E_code and B_name + M_name × n + E_name.
6. The bidding information processing method according to claim 4, wherein in step S35, obtaining the tenderer, the agent and the bidder specifically includes:
first, enterprise names are identified with the third-party Python library foolnltk; then, for each company, the corresponding context is taken at every position where its name appears, and the company's role is classified from that context by a deep learning model (including but not limited to word vector + bidirectional LSTM + softmax), the classification results being tenderer, agent, first winning bid candidate, second winning bid candidate, third winning bid candidate, and none.
7. The bidding information processing method according to claim 1, further comprising:
providing a bidding project search interface, receiving search conditions entered by a user, and pushing bidding information according to the search conditions.
8. The bidding information processing method according to claim 7, further comprising:
automatically pushing bidding information according to the search conditions, push time and receiving mode entered by the user.
9. A bidding information processing system, comprising:
an entry acquisition module, used for collecting bidding information from the Internet and obtaining the link entry of each bidding project detail page;
a bidding announcement classification module, used for requesting the detail page link of the bidding announcement, obtaining the detail page of the bidding project, parsing the publication time and the specific details of the bidding announcement from the detail page, classifying the bidding announcement according to a preset announcement classification rule, and storing the classified announcement in the corresponding category table in a database;
a bidding information extraction module, used for extracting the elements of the bidding information, including the project number, project name, tenderer, agent and bidder.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium contains a bidding information processing method program which, when executed by a processor, implements the bidding information processing method according to any one of claims 1 to 8.
CN202110660790.6A 2021-06-15 2021-06-15 Bidding information processing method, system and readable storage medium Pending CN113409111A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110660790.6A CN113409111A (en) 2021-06-15 2021-06-15 Bidding information processing method, system and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110660790.6A CN113409111A (en) 2021-06-15 2021-06-15 Bidding information processing method, system and readable storage medium

Publications (1)

Publication Number Publication Date
CN113409111A (en) 2021-09-17

Family

ID=77683825

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110660790.6A Pending CN113409111A (en) 2021-06-15 2021-06-15 Bidding information processing method, system and readable storage medium

Country Status (1)

Country Link
CN (1) CN113409111A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115187349A (en) * 2022-09-13 2022-10-14 工保科技(浙江)有限公司 Information processing method and device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2015194955A (en) * 2014-03-31 2015-11-05 株式会社ナビット Bid information search system
CN107239891A (en) * 2017-05-26 2017-10-10 山东省科学院情报研究所 A kind of bid checking method based on big data
CN108563729A (en) * 2018-04-04 2018-09-21 福州大学 A kind of bidding website acceptance of the bid information extraction method based on dom tree
CN111506795A (en) * 2020-04-20 2020-08-07 北京中电普华信息技术有限公司 Bidding information acquisition method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
张应成等: "基于BiLSTM-CRF的商情实体识别模型", 《计算机工程》 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210917