CN105389338A - Analysis method of procurement bid winning data - Google Patents

Publication number
CN105389338A
Authority
CN
China
Prior art keywords
bid
acceptance
data
record
analytic method
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510683420.9A
Other languages
Chinese (zh)
Other versions
CN105389338B (en)
Inventor
陈国强
姬永杰
朱培冬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing UYU Government Software Co.,Ltd.
Original Assignee
BEIJING UFIDA SOFTWARE CO LTD
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BEIJING UFIDA SOFTWARE CO LTD
Priority to CN201510683420.9A
Publication of CN105389338A
Application granted
Publication of CN105389338B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/254 Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses

Abstract

The invention discloses an analysis method for procurement bid-winning data and relates to the field of ETL (data Extract, Transform and Load) in data warehouse technology. The method comprises the following steps: separating the standard table data and the non-standard table data in an HTML procurement bid-winning notice text; parsing the standard table data and the non-standard table data separately according to the bid-winning notice attributes of the procurement bid-winning notice text to obtain bid-winning records; and storing the bid-winning records obtained by parsing in a database. By separating the standard table data from the non-standard table data in the procurement bid-winning notice text, the method parses procurement bid-winning data efficiently and accurately and provides a foundation for in-depth mining and utilization of that data.

Description

An analysis method for procurement bid-winning data
Technical field
The present invention relates to the field of ETL (data extraction, transformation and loading) in data warehouse technology, and in particular to an analysis method for procurement bid-winning data.
Background
With the rapid development of Internet technology, Internet users publish large numbers of HTML (HyperText Markup Language) documents, pictures, videos and other files on the Web every day, and various crawler engines continuously capture, analyze and apply these data from all kinds of websites. At present, search engines support web page retrieval by applying processing such as word segmentation to HTML text.
In the field of government procurement, as governments at all levels further increase the disclosure of government information, government websites publish data more frequently and the information they contain is richer. However, for lack of a specific business model and parsing method, the government procurement notices of departments at different levels have no unified format and are worded differently. Existing search engines simply copy these notices in full and offer a basic query service through full-text search. Because no structured model is established, the captured bid-winning notice HTML documents cannot be mined and utilized in depth, and the results of full-text search over bid-winning notices often fall far short of user needs.
Bid-winning records are the most valuable data in government procurement and include attributes such as supplier, winning bid amount, sub-package number, purchaser, project name and experts. The existing general method for parsing government procurement bid-winning notice HTML documents maintains a set of keywords in advance for matching; for example, the keywords for the supplier include "winning bidder", "supplier", "bid-winning candidate", "quoter" and so on. The positions of the main attributes, such as supplier, winning bid amount, first candidate and sub-package number, are located by keyword. With this conventional keyword matching method the parsing success rate is often below 50%, so a more advanced parsing method is needed to raise it.
Summary of the invention
In view of the defects of the prior art and the needs of practical application, the object of the present invention is to provide an efficient and accurate analysis method for procurement bid-winning data.
To achieve the above object, the present invention adopts the following technical solution:
An analysis method for procurement bid-winning data, comprising the following steps:
(1) separating the standard table data and the non-standard table data in the HTML procurement bid-winning notice text to be parsed;
(2) parsing the standard table data and the non-standard table data separately according to the bid-winning notice attributes of the procurement bid-winning notice text, to obtain bid-winning records;
(3) storing the bid-winning records obtained by parsing in a database.
Further, in the analysis method described above, in step (2), the bid-winning notice attributes comprise project name, supplier, winning bid amount, purchaser and a first bid-winning candidate flag.
Further, in the analysis method described above, in step (1), the standard table data is table data in which the specified bid-winning notice attributes lie in the same row but in different columns of the table; the specified bid-winning notice attributes comprise the supplier and the winning bid amount.
Further, in the analysis method described above, in step (1), separating the standard table data and the non-standard table data in the HTML procurement bid-winning notice text to be parsed comprises:
1) separating out all tables in the HTML procurement bid-winning notice text according to the table tag of the HTML text, where all tables include the sub-tables nested inside tables;
2) judging whether the specified bid-winning notice attributes in a table lie in the same row and in different columns of the table; if so, the table is determined to be a standard table, and if not, the table is determined to be a non-standard table.
Further, in the analysis method described above, in step (2), parsing the standard table data comprises:
1. obtaining the column number of each bid-winning notice attribute in the standard table data;
2. looping over every row of the table and, according to the column number of each bid-winning notice attribute, obtaining the value of each bid-winning notice attribute in that row, to obtain the bid-winning record of every row.
Further, in the analysis method described above, in step (2), the non-standard table data is parsed with a text string parsing method, comprising:
For one piece of non-standard table data, searching the non-standard table data using the bid-winning notice attributes, or the associated prefixes or suffixes of the bid-winning notice attributes, as keywords, obtaining the attribute value of each bid-winning notice attribute, and obtaining a bid-winning record from the bid-winning notice attributes and their attribute values.
Further, in the analysis method described above, in step (2), when the standard table data and the non-standard table data are parsed, parsing follows the nesting order of the tables and starts from the data of the innermost nested table; after one layer of table data has been parsed, the table data of that layer is deleted.
Further, in the analysis method described above, in step (1), before the standard table data and the non-standard table data in the HTML procurement bid-winning notice text to be parsed are separated, the method further comprises:
Preprocessing the HTML procurement bid-winning notice text to be parsed, and deleting the data in it that is irrelevant to the bid-winning content.
Still further, in the analysis method described above, in step (3), before the bid-winning records obtained by parsing are stored in the database, the method further comprises:
Judging, according to the attribute values of the bid-winning notice attributes, whether a bid-winning record is valid; if so, the record is retained, and if not, the record is deleted.
Further, in the analysis method described above, in step (3), before the bid-winning records obtained by parsing are stored in the database, the method further comprises:
Identifying duplicate records among the bid-winning records according to the identifier of the table to which each record belongs and the attribute values of its bid-winning notice attributes, and de-duplicating them. The criterion is: if two bid-winning records belong to tables with the same identifier and the attribute values of their bid-winning notice attributes are identical, the two records are judged to be duplicates.
The beneficial effect of the present invention is as follows: the analysis method for procurement bid-winning data provided by the invention converts unstructured procurement bid-winning notices in HTML format into structured bid-winning records. By separating standard table data from non-standard table data and parsing each with a different parsing approach, the method effectively raises the parsing success rate and provides a foundation for in-depth mining and utilization of procurement bid-winning notice data.
Brief description of the drawings
Fig. 1 is a flow chart of an analysis method for procurement bid-winning data in an embodiment;
Fig. 2 is a flow chart of the parsing of standard table data in the embodiment;
Fig. 3 is a flow chart of the parsing of non-standard table data in the embodiment;
Fig. 4 is a schematic diagram of standard table data;
Fig. 5 is a schematic diagram of non-standard table data.
Detailed description of the embodiments
The present invention is described in further detail below with reference to the drawings and the embodiments.
Fig. 1 shows a flow chart of an analysis method for procurement bid-winning data in a specific embodiment of the invention. The method may comprise the following steps:
Step S100: separate the standard table data and the non-standard table data in the HTML procurement bid-winning notice text to be parsed;
First, the HTML procurement bid-winning notice text to be parsed is preprocessed, and the data in it that is irrelevant to the bid-winning content is deleted. A real HTML procurement bid-winning notice text contains a great deal of data that is irrelevant to the actual bid-winning content, such as data relating to text display (the font, size and color of the text, etc.) or other content that does not involve substantive bid-winning data. This data can therefore be deleted in advance to improve the efficiency of subsequent data processing.
In practical applications, data in the HTML procurement bid-winning notice text that relates only to display and is irrelevant to the bid-winning content can be found from the display-class tags in the HTML text and then deleted. The display-class tags include, but are not limited to, the <font> tag, which defines the font, size and color of text, the <span> tag, which defines a section in a document, and meaningless spaces.
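The preprocessing described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `preprocess` is hypothetical, the sketch uses regular expressions rather than a full HTML parser, and it assumes that only the display tags themselves are dropped while the text they wrap is kept.

```python
import re

# Display-only tags named in the text: <font> and <span>.
# Assumption: the tags are removed, the wrapped text is preserved.
DISPLAY_TAG = re.compile(r"</?(?:font|span)\b[^>]*>", re.IGNORECASE)
# Meaningless spaces: the &nbsp; entity, the no-break space character,
# and runs of ordinary whitespace.
NBSP = re.compile(r"&nbsp;|\u00a0", re.IGNORECASE)
SPACES = re.compile(r"[ \t\r\n]+")


def preprocess(html: str) -> str:
    """Remove display-only markup that carries no bid-winning content."""
    html = DISPLAY_TAG.sub("", html)
    html = NBSP.sub(" ", html)
    return SPACES.sub(" ", html).strip()


if __name__ == "__main__":
    raw = '<td><font color="red">Supplier:</font>&nbsp;<span class="x">ACME Co.</span></td>'
    print(preprocess(raw))
```

Structural tags such as <table>, <tr> and <td> are deliberately left untouched, because the later steps rely on them.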
In this embodiment, the table data comprises standard table data and non-standard table data. Standard table data is table data in which the specified bid-winning notice attributes lie in the same row but in different columns of the table; the specified bid-winning notice attributes include, but are not limited to, the supplier and the winning bid amount. The table data shown in Fig. 4 is standard table data: in that table the quoter name (that is, the supplier) and the bid amount (that is, the winning bid amount) lie in the same row and in different columns. Non-standard table data is any data other than standard table data.
In this embodiment, the bid-winning notice attributes comprise project name, supplier, winning bid amount, purchaser, a first bid-winning candidate flag and so on. It should be noted that the names of the bid-winning notice attributes may differ between notices, and the attributes can be named according to the actual situation; for example, the supplier may also be called the quoter, and the winning bid amount may be called the bid amount.
In this embodiment, the standard table data and the non-standard table data in the HTML procurement bid-winning notice text to be parsed are separated as follows:
1) all tables in the HTML procurement bid-winning notice text are separated out according to the table tag of the HTML text; all tables include the sub-tables nested inside tables;
2) it is judged whether the specified bid-winning notice attributes in a table lie in the same row and in different columns of the table. If so, the table is determined to be a standard table and the data in it is standard table data; if not, the table is determined to be a non-standard table and the data in it is non-standard table data.
In practical applications, a nested <table> (a <table> containing a <table>) is separated into N independent sub-<table> elements. Every sub-<table> tag begins with "<table" and ends with "</table>". By locating and counting the keywords "<table>" and "</table>", the nested sub-<table> tags are found in turn and separated one by one, which yields the complete character string (table data) of each sub-<table> tag; this string is passed as the entry parameter in a recursive call to the data-parsing recursive algorithm.
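The locate-and-count separation of nested tables can be sketched as below. The sketch is an assumption-laden illustration: `split_tables` is a hypothetical name, and instead of recursion it keeps a stack of the offsets of the currently open "<table" tags, which gives the same pairing of opening and closing tags.

```python
import re

# Matches either an opening "<table ...>" tag or a closing "</table>" tag.
TABLE_TAG = re.compile(r"<table\b[^>]*>|</table\s*>", re.IGNORECASE)


def split_tables(html: str) -> list[tuple[int, str]]:
    """Return (depth, table_string) for every table, nested ones included.

    depth is 1 for a top-level table, 2 for a table nested inside it, etc.
    Inner tables are emitted before the tables that contain them.
    """
    open_positions = []  # stack of start offsets of the open tables
    tables = []
    for match in TABLE_TAG.finditer(html):
        if match.group().startswith("</"):
            if not open_positions:
                continue  # stray closing tag: ignore it
            start = open_positions.pop()
            depth = len(open_positions) + 1
            tables.append((depth, html[start:match.end()]))
        else:
            open_positions.append(match.start())
    return tables


if __name__ == "__main__":
    page = ("<p>x</p><table><tr><td>outer"
            "<table><tr><td>inner</td></tr></table></td></tr></table>")
    for depth, table in split_tables(page):
        print(depth, table)
```

Each returned string is the complete character string of one <table> element, so it can be handed directly to a table-level parser.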
After all <table> tags (nested and non-nested) have been processed, the separation of the standard table data and the non-standard table data in the bid-winning notice text is complete; standard table data is shown in Fig. 4 and non-standard table data in Fig. 5. In practical applications, the specified bid-winning notice attributes used to distinguish standard tables from non-standard tables can be chosen as needed. In this embodiment, a table is judged to be a standard table by checking whether the supplier and the winning bid amount are in the same row. The two necessary conditions are:
1) in different columns: a cell tag "</td>" lies between position A of the supplier and position B of the bid amount;
2) in the same row: no row tag "</tr>" lies between position A of the supplier and position B of the bid amount.
In the table shown in Fig. 4, the quoter name (supplier) and the bid amount satisfy both necessary conditions, so the table data in Fig. 4 is judged to be standard table data.
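The two necessary conditions translate almost directly into string checks. The following sketch assumes English stand-in keywords (`supplier_keys`, `amount_keys`) in place of the Chinese keywords a real notice would contain; the function names are hypothetical.

```python
def _first_position(text: str, keywords) -> int:
    """Smallest offset at which any keyword occurs, or -1 if none occurs."""
    hits = [p for p in (text.find(k) for k in keywords) if p >= 0]
    return min(hits) if hits else -1


def is_standard_table(table_html: str,
                      supplier_keys=("supplier", "quoter"),
                      amount_keys=("bid amount",)) -> bool:
    """Check whether supplier and bid amount sit in one row, different columns."""
    text = table_html.lower()
    pos_a = _first_position(text, supplier_keys)  # position A: supplier
    pos_b = _first_position(text, amount_keys)    # position B: bid amount
    if pos_a < 0 or pos_b < 0:
        return False
    start, end = sorted((pos_a, pos_b))
    between = text[start:end]
    # Condition 1: a cell boundary lies between A and B (different columns).
    # Condition 2: no row boundary lies between A and B (same row).
    return "</td>" in between and "</tr>" not in between


if __name__ == "__main__":
    standard = ("<table><tr><td>No.</td><td>Supplier</td><td>Bid amount</td></tr>"
                "<tr><td>1</td><td>ACME</td><td>100</td></tr></table>")
    print(is_standard_table(standard))
```

Because the source names "</td>" specifically, header cells written with <th> would not satisfy condition 1 in this sketch.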
Step S200: parse the standard table data and the non-standard table data separately according to the bid-winning notice attributes of the procurement bid-winning notice text, to obtain bid-winning records;
After the standard table data and the non-standard table data in the text have been separated, each is parsed in its own way. Because tables can be nested, parsing follows the nesting order and starts from the data of the innermost nested table; after one layer of table data has been parsed, the table data of that layer is deleted, and the table data of the enclosing layer is parsed next. Parsing from the inside out ensures that the processing of an outer table tag is not disturbed by nested table tags, so that bid-winning records are obtained more accurately.
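The inside-out order, with each parsed layer deleted before its enclosing layer is handled, can be sketched as a loop. Here `parse_inside_out` is a hypothetical name, and `parse_table` stands for whichever table-level parser is applied (standard or non-standard); it is assumed to return a list of records.

```python
import re

# A table whose body contains no further "<table" is an innermost table.
INNERMOST = re.compile(
    r"<table\b[^>]*>(?:(?!<table\b).)*?</table\s*>",
    re.IGNORECASE | re.DOTALL,
)


def parse_inside_out(html: str, parse_table) -> list:
    """Parse innermost tables first, deleting each layer once it is parsed."""
    records = []
    while True:
        match = INNERMOST.search(html)
        if match is None:
            break
        records.extend(parse_table(match.group()))
        # Delete the parsed layer so the enclosing table no longer contains it.
        html = html[:match.start()] + html[match.end():]
    return records


if __name__ == "__main__":
    page = "<table><tr><td>outer<table>inner</table></td></tr></table>"
    # For demonstration the "parser" just returns the table string itself.
    for item in parse_inside_out(page, lambda table: [table]):
        print(item)
```

Every iteration removes at least one complete table from the string, so the loop terminates; an unclosed "<table" simply stops matching.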
In this embodiment, standard table data is parsed as shown in Fig. 2, in the following steps:
1. Obtain the column number of each bid-winning notice attribute in the standard table data. The table data is searched using the names of the bid-winning notice attributes as keywords, which precisely locates the column in which each attribute lies. For the table data shown in Fig. 4, the column number of the supplier is 2 and the column number of the first-candidate flag is 5;
2. Loop over every row of the table and, according to the column number of each bid-winning notice attribute, obtain the value of each bid-winning notice attribute in that row, which gives the bid-winning record of that row. Every row of the standard table data corresponds to one bid-winning record.
For the second row of the standard table data shown in Fig. 4, the bid-winning record obtained by parsing is: supplier: Guangzhou Xingu Electronic Science and Technology Co., Ltd.; winning bid amount: 246000; this supplier is the first candidate.
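The two steps for a standard table (locate the column of each attribute, then loop over the rows) can be sketched as follows. The header keywords in `HEADER_KEYWORDS` are hypothetical English stand-ins, the table is assumed to contain no nested tables, and column indices here are zero-based, whereas the column numbers quoted for Fig. 4 count from 1.

```python
import re

ROW = re.compile(r"<tr\b[^>]*>(.*?)</tr\s*>", re.IGNORECASE | re.DOTALL)
CELL = re.compile(r"<t[dh]\b[^>]*>(.*?)</t[dh]\s*>", re.IGNORECASE | re.DOTALL)
TAG = re.compile(r"<[^>]+>")

# Hypothetical header keywords for each bid-winning notice attribute.
HEADER_KEYWORDS = {
    "supplier": ("supplier", "quoter"),
    "amount": ("bid amount",),
    "first_candidate": ("first candidate",),
}


def table_rows(table_html: str) -> list[list[str]]:
    """Split a table into rows of plain-text cell values."""
    return [[TAG.sub("", cell).strip() for cell in CELL.findall(row)]
            for row in ROW.findall(table_html)]


def parse_standard_table(table_html: str) -> list[dict]:
    rows = table_rows(table_html)
    # Step 1: find the header row and the column index of each attribute.
    for header_index, header in enumerate(rows):
        columns = {}
        for attr, keys in HEADER_KEYWORDS.items():
            for col, cell in enumerate(header):
                if any(key in cell.lower() for key in keys):
                    columns[attr] = col
                    break
        if "supplier" in columns and "amount" in columns:
            break
    else:
        return []  # no row names both the supplier and the amount
    # Step 2: every row below the header yields one bid-winning record.
    records = []
    for row in rows[header_index + 1:]:
        if len(row) <= max(columns.values()):
            continue  # short row, for example one with merged cells
        records.append({attr: row[col] for attr, col in columns.items()})
    return records
```

A call on a table whose header row reads "No. / Quoter name / Bid amount / First candidate" returns one dictionary per data row, keyed by `supplier`, `amount` and `first_candidate`.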
In this embodiment, non-standard table data is parsed with a text string parsing method; the parsing flow is shown in Fig. 3 and specifically comprises:
For one piece of non-standard table data, the data is searched using the bid-winning notice attributes, or the associated prefixes or suffixes of the bid-winning notice attributes, as keywords; the attribute value of each bid-winning notice attribute is obtained, and one bid-winning record is obtained from the bid-winning notice attributes and their attribute values.
In practical applications, the name of the supplier must be located first. The sub-package data can be searched using "supplier", "quoter", "quoting company" and the like as keywords. If nothing is found, matching can be performed on the associated prefix or suffix of the supplier: for example, a supplier name is usually preceded by a special prefix such as ":" and usually ends with a suffix such as "company", and the supplier can be located by searching for these prefixes or suffixes. Once the supplier has been located, the winning bid amount and the other bid-winning notice attributes are parsed in a similar way: the name of the attribute (such as "winning bid amount") is searched for directly as a keyword, and if it is not found, the search falls back to the associated prefixes or suffixes (for "winning bid amount", associated suffixes such as "amount", "yuan" and "price").
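The keyword-first, affix-fallback search can be sketched as below. Everything concrete in it is an assumption made for illustration: the keyword tuples, the company suffixes ("Co., Ltd.", "Company", "公司"), the currency suffixes ("yuan", "元"), the requirement that a keyword be followed by a colon, and the function names.

```python
import re

TAG = re.compile(r"<[^>]+>")
NUMBER = re.compile(r"\d[\d,]*(?:\.\d+)?")
SUPPLIER_KEYS = ("supplier", "quoter", "quoting company")
AMOUNT_KEYS = ("winning bid amount", "bid amount")
# Fallback 1: a name that follows a colon and ends in a company suffix.
SUPPLIER_BY_AFFIX = re.compile(r"[:：]\s*([^:：\n]*?(?:Co\., Ltd\.|Company|公司))")
# Fallback 2: a number that is followed by a currency suffix.
AMOUNT_BY_SUFFIX = re.compile(r"\d[\d,]*(?:\.\d+)?(?=\s*(?:yuan|元))", re.IGNORECASE)


def value_after_keyword(text: str, keywords):
    """Return the text that follows 'keyword:' on the same or the next line."""
    for key in keywords:
        found = re.search(re.escape(key) + r"\s*[:：]\s*([^\n]+)", text, re.IGNORECASE)
        if found:
            return found.group(1).strip()
    return None


def parse_nonstandard_table(table_html: str):
    """Return one record for a non-standard table, or None if no supplier."""
    text = TAG.sub("\n", table_html)  # every tag becomes a line break
    supplier = value_after_keyword(text, SUPPLIER_KEYS)
    if supplier is None:
        found = SUPPLIER_BY_AFFIX.search(text)
        supplier = found.group(1).strip() if found else None
    if supplier is None:
        return None
    amount_text = value_after_keyword(text, AMOUNT_KEYS)
    if amount_text is not None:
        found = NUMBER.search(amount_text)
    else:
        found = AMOUNT_BY_SUFFIX.search(text)
    amount = found.group().replace(",", "") if found else None
    return {"supplier": supplier, "amount": amount}
```

The supplier is resolved first, as the text requires, and a table with no recognizable supplier yields no record at all.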
After the standard table data and the non-standard table data have been parsed and the bid-winning records obtained, other related information of a record, such as the project name and the experts, can in practical applications also be obtained with the conventional keyword matching method, to ensure that the bid-winning records are complete.
Step S300: store the bid-winning records obtained by parsing in a database.
After the bid-winning records have been obtained by the parsing in step S200, the bid-winning data is stored in the database.
Because bid-winning data can contain repeated descriptions, the validity of the bid-winning records must be checked and the records de-duplicated before they are actually stored.
In this embodiment, the validity check judges, according to the attribute values of the bid-winning notice attributes, whether a bid-winning record is valid; if so, the record is retained, and if not, the record is deleted. For example, a record can be judged by checking whether the supplier is valid, whether the winning bid amount is 0, or whether the supplier is the first candidate. In general, if the supplier and the winning bid amount show no obvious problem, the record can be regarded as a valid bid-winning record.
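A validity check of this kind might look like the sketch below. The specific rules are assumptions chosen for illustration: a supplier name of at least two characters, and an amount that parses as a number greater than zero (which also rejects negative values, a slightly stricter rule than "is not 0").

```python
def is_valid_record(record: dict) -> bool:
    """Keep a record only if its supplier and amount look plausible."""
    supplier = (record.get("supplier") or "").strip()
    if len(supplier) < 2:
        return False  # missing or implausibly short supplier name
    try:
        amount = float(str(record.get("amount", "")).replace(",", ""))
    except ValueError:
        return False  # amount is missing or not a number
    return amount > 0


if __name__ == "__main__":
    records = [
        {"supplier": "ACME Co.", "amount": "246000"},
        {"supplier": "ACME Co.", "amount": "0"},
        {"supplier": "", "amount": "100"},
    ]
    print([r for r in records if is_valid_record(r)])
```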
In this embodiment, de-duplication identifies duplicate records among the bid-winning records according to the identifier of the table to which each record belongs and the attribute values of its bid-winning notice attributes, and removes them. The criterion is: if two bid-winning records belong to tables with the same identifier and the attribute values of their bid-winning notice attributes are identical, the two records are judged to be duplicates. The identifier of a table uniquely identifies that table. The non-standard table data shown in Fig. 6 contains three pieces of non-standard table data, and the identifiers of the tables to which they belong are "Package 1", "Package 2" and "Package 3" respectively. In general, every table in a bid-winning notice text in HTML format carries its own identifier; if it does not, this embodiment by default assigns a unique identification number to each table.
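The duplicate criterion (same table identifier and identical attribute values) can be sketched with a set of already-seen keys. The field name `table_id` and the function name `deduplicate` are hypothetical; attribute values are assumed to be hashable, for example strings.

```python
def deduplicate(records: list[dict]) -> list[dict]:
    """Drop records whose table identifier and attribute values both repeat."""
    seen = set()
    unique = []
    for record in records:
        attributes = tuple(sorted(
            (name, value) for name, value in record.items() if name != "table_id"
        ))
        key = (record["table_id"], attributes)
        if key not in seen:
            seen.add(key)
            unique.append(record)  # the first occurrence is kept
    return unique


if __name__ == "__main__":
    records = [
        {"table_id": "Package 1", "supplier": "ACME Co.", "amount": "246000"},
        {"table_id": "Package 1", "supplier": "ACME Co.", "amount": "246000"},
        {"table_id": "Package 2", "supplier": "ACME Co.", "amount": "246000"},
    ]
    print(deduplicate(records))
```

Two records with identical attribute values but different table identifiers are both kept, since the same supplier can win several packages at the same amount.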
After the validity check and de-duplication of the bid-winning records are complete, the related information of the valid bid-winning records is saved in the database.
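The final storage step can be sketched with Python's built-in `sqlite3` module. The source does not name a database product or a schema, so the table name `bid_record` and its columns are assumptions made purely for illustration.

```python
import sqlite3


def store_records(conn: sqlite3.Connection, records: list[dict]) -> None:
    """Insert valid, de-duplicated bid-winning records into the database."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS bid_record ("
        "table_id TEXT, supplier TEXT, amount REAL, first_candidate INTEGER)"
    )
    # Named placeholders are filled from the keys of each record dictionary.
    conn.executemany(
        "INSERT INTO bid_record (table_id, supplier, amount, first_candidate) "
        "VALUES (:table_id, :supplier, :amount, :first_candidate)",
        records,
    )
    conn.commit()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    store_records(connection, [
        {"table_id": "Package 1", "supplier": "ACME Co.",
         "amount": 246000.0, "first_candidate": 1},
    ])
    print(connection.execute("SELECT * FROM bid_record").fetchall())
```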
The analysis method for procurement bid-winning data provided in this embodiment converts unstructured procurement bid-winning notices (HTML bid-winning text) into structured bid-winning records and stores them. The method is particularly suited to parsing government procurement bid-winning notices. In practice, it effectively identifies more than 90% of government procurement bid-winning records, which greatly improves the efficiency and accuracy of bid-winning data parsing.
Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. If such changes and modifications fall within the scope of the claims of the present invention and their technical equivalents, the present invention is intended to cover them as well.

Claims (10)

1. An analysis method for procurement bid-winning data, comprising the following steps:
(1) separating the standard table data and the non-standard table data in the HTML procurement bid-winning notice text to be parsed;
(2) parsing the standard table data and the non-standard table data separately according to the bid-winning notice attributes of the procurement bid-winning notice text, to obtain bid-winning records;
(3) storing the bid-winning records obtained by parsing in a database.
2. The analysis method for procurement bid-winning data according to claim 1, characterized in that: in step (2), the bid-winning notice attributes comprise project name, supplier, winning bid amount, purchaser and a first bid-winning candidate flag.
3. The analysis method for procurement bid-winning data according to claim 2, characterized in that: in step (1), the standard table data is table data in which the specified bid-winning notice attributes lie in the same row but in different columns of the table; the specified bid-winning notice attributes comprise the supplier and the winning bid amount.
4. The analysis method for procurement bid-winning data according to claim 3, characterized in that: in step (1), separating the standard table data and the non-standard table data in the HTML procurement bid-winning notice text to be parsed comprises:
1) separating out all tables in the HTML procurement bid-winning notice text according to the table tag of the HTML text, where all tables include the sub-tables nested inside tables;
2) judging whether the specified bid-winning notice attributes in a table lie in the same row and in different columns of the table; if so, the table is determined to be a standard table, and if not, the table is determined to be a non-standard table.
5. The analysis method for procurement bid-winning data according to claim 4, characterized in that: in step (2), parsing the standard table data comprises:
1. obtaining the column number of each bid-winning notice attribute in the standard table data;
2. looping over every row of the table and, according to the column number of each bid-winning notice attribute, obtaining the value of each bid-winning notice attribute in that row, to obtain the bid-winning record of every row.
6. The analysis method for procurement bid-winning data according to claim 1, characterized in that: in step (2), the non-standard table data is parsed with a text string parsing method, comprising:
For one piece of non-standard table data, searching the non-standard table data using the bid-winning notice attributes, or the associated prefixes or suffixes of the bid-winning notice attributes, as keywords, obtaining the attribute value of each bid-winning notice attribute, and obtaining a bid-winning record from the bid-winning notice attributes and their attribute values.
7. The analysis method for procurement bid-winning data according to any one of claims 4 to 6, characterized in that: in step (2), when the standard table data and the non-standard table data are parsed, parsing follows the nesting order of the tables and starts from the data of the innermost nested table; after one layer of table data has been parsed, the table data of that layer is deleted.
8. The analysis method for procurement bid-winning data according to claim 1, characterized in that: in step (1), before the standard table data and the non-standard table data in the HTML procurement bid-winning notice text to be parsed are separated, the method further comprises:
Preprocessing the HTML procurement bid-winning notice text to be parsed, and deleting the data in it that is irrelevant to the bid-winning content.
9. The analysis method for procurement bid-winning data according to claim 1, characterized in that: in step (3), before the bid-winning records obtained by parsing are stored in the database, the method further comprises:
Judging, according to the attribute values of the bid-winning notice attributes, whether a bid-winning record is valid; if so, the record is retained, and if not, the record is deleted.
10. The analysis method for procurement bid-winning data according to claim 1, characterized in that: in step (3), before the bid-winning records obtained by parsing are stored in the database, the method further comprises:
Identifying duplicate records among the bid-winning records according to the identifier of the table to which each record belongs and the attribute values of its bid-winning notice attributes, and de-duplicating them. The criterion is: if two bid-winning records belong to tables with the same identifier and the attribute values of their bid-winning notice attributes are identical, the two records are judged to be duplicates.
CN201510683420.9A 2015-10-20 2015-10-20 Analysis method of procurement bid winning data Active CN105389338B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510683420.9A CN105389338B (en) 2015-10-20 2015-10-20 Analysis method of procurement bid winning data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510683420.9A CN105389338B (en) 2015-10-20 2015-10-20 Analysis method of procurement bid winning data

Publications (2)

Publication Number Publication Date
CN105389338A true CN105389338A (en) 2016-03-09
CN105389338B CN105389338B (en) 2018-09-04

Family

ID=55421628

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510683420.9A Active CN105389338B (en) 2015-10-20 2015-10-20 Analysis method of procurement bid winning data

Country Status (1)

Country Link
CN (1) CN105389338B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106250456A (en) * 2016-07-28 2016-12-21 浪潮软件集团有限公司 Bid winning announcement extraction method and device
CN107832381A (en) * 2017-10-30 2018-03-23 北京大数元科技发展有限公司 Method and system for judging government procurement bid-winning notices collected from the Internet
CN110069622A (en) * 2017-08-01 2019-07-30 武汉楚鼎信息技术有限公司 Intelligent extraction method for individual-stock announcement abstracts
CN114357054A (en) * 2022-03-10 2022-04-15 广州宸祺出行科技有限公司 Method and device for processing unstructured data based on ClickHouse

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2001109741A (en) * 1999-10-13 2001-04-20 Toshiba Corp Method and system for preparing html data
US20090043629A1 (en) * 2007-08-10 2009-02-12 Kap Holdings, Llc System and method for provision of maintenance information and products
CN101576891A (en) * 2008-05-05 2009-11-11 北京瑞佳晨科技有限公司 Method for analyzing web page form object nodes
CN101908078A (en) * 2010-08-30 2010-12-08 深圳市五巨科技有限公司 Method and device for importing webpage data to EXCEL sheet
CN102222227A (en) * 2011-04-25 2011-10-19 中国华录集团有限公司 Video identification based system for extracting film images
CN104468194A (en) * 2014-11-05 2015-03-25 北京星网锐捷网络技术有限公司 Network device compatible method and forwarding server
CN104717085A (en) * 2013-12-16 2015-06-17 中国移动通信集团湖南有限公司 Log parsing method and device

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2001109741A (en) * 1999-10-13 2001-04-20 Toshiba Corp Method and system for preparing html data
US20090043629A1 (en) * 2007-08-10 2009-02-12 Kap Holdings, Llc System and method for provision of maintenance information and products
CN101576891A (en) * 2008-05-05 2009-11-11 北京瑞佳晨科技有限公司 Method for analyzing web page form object nodes
CN101908078A (en) * 2010-08-30 2010-12-08 深圳市五巨科技有限公司 Method and device for importing webpage data to EXCEL sheet
CN102222227A (en) * 2011-04-25 2011-10-19 中国华录集团有限公司 Video identification based system for extracting film images
CN104717085A (en) * 2013-12-16 2015-06-17 中国移动通信集团湖南有限公司 Log parsing method and device
CN104468194A (en) * 2014-11-05 2015-03-25 北京星网锐捷网络技术有限公司 Network device compatible method and forwarding server

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
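Guo Wencai et al.: "Research on supply chain integration architecture based on multi-agent and XML", Journal of Beijing Institute of Technology (Social Sciences Edition) *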

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106250456A (en) * 2016-07-28 2016-12-21 浪潮软件集团有限公司 Bid winning announcement extraction method and device
CN110069622A (en) * 2017-08-01 2019-07-30 武汉楚鼎信息技术有限公司 Intelligent method for extracting abstracts of individual stock announcements
CN107832381A (en) * 2017-10-30 2018-03-23 北京大数元科技发展有限公司 Method and system for identifying government procurement bid winning announcements collected from the internet
CN114357054A (en) * 2022-03-10 2022-04-15 广州宸祺出行科技有限公司 Method and device for processing unstructured data based on ClickHouse
CN114357054B (en) * 2022-03-10 2022-06-03 广州宸祺出行科技有限公司 Method and device for processing unstructured data based on ClickHouse

Also Published As

Publication number Publication date
CN105389338B (en) 2018-09-04

Similar Documents

Publication Publication Date Title
US9280561B2 (en) Automatic learning of logos for visual recognition
CN102982153B (en) Information retrieval method and device
US20170177733A1 (en) Tenantization of search result ranking
US20120117051A1 (en) Multi-modal approach to search query input
CN102567329B (en) Data query method and data query system
AU740007B2 (en) Network-based classified information systems
US11561988B2 (en) Systems and methods for harvesting data associated with fraudulent content in a networked environment
CN103886022B (en) Query tool and method for paged queries based on a primary key field
CN108664637B (en) Retrieval method and system
US9323834B2 (en) Semantic and contextual searching of knowledge repositories
CN103425687A (en) Retrieval method and system based on queries
US8606780B2 (en) Image re-rank based on image annotations
Gentile et al. Unsupervised wrapper induction using linked data
US20090228430A1 (en) Multidimensional data cubes with high-cardinality attributes
CN102648466A (en) A method for retrieving a data item annotation in a view
CN105389338A (en) Analysis method of procurement bid winning data
CN101916294A (en) Method for realizing exact search by utilizing semantic analysis
US20150302090A1 (en) Method and System for the Structural Analysis of Websites
US20160034484A1 (en) Document tagging and retrieval using entity specifiers
CN109101512B (en) Construction method of legal database, legal data query method and device
Hassanzadeh et al. Helix: Online enterprise data analytics
US8799312B2 (en) Efficient label acquisition for query rewriting
CN107526795B (en) Knowledge base construction method and device, storage medium and computing equipment
CN112687403A (en) Medicine dictionary generation and medicine search method and device
JP2010272006A (en) Relation extraction apparatus, relation extraction method and program

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP01 Change in the name or title of a patent holder

Address after: 100094 2F, building 11, UFIDA Software Park, 68 Beiqing Road, Haidian District, Beijing

Patentee after: Beijing UYU Government Software Co.,Ltd.

Address before: 100094 2F, building 11, UFIDA Software Park, 68 Beiqing Road, Haidian District, Beijing

Patentee before: YONYOU GOVERNMENT AFFAIRS SOFTWARE Co.,Ltd.