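CN110147477B - Method, apparatus, and device for model-based extraction of data resources of a Web system - Google Patents
Method, apparatus, and device for model-based extraction of data resources of a Web system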
- Publication number
- CN110147477B (application CN201910295549.0A; also published as CN110147477A)
- Authority
- CN
- China
- Prior art keywords
- web
- data
- extraction
- extraction model
- page
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Concept | Type | Sections | Count |
---|---|---|---|
extraction | Methods | title, claims, abstract, description | 128 |
method | Methods | claims, abstract, description | 39 |
static effect | Effects | claims, description | 38 |
analytical method | Methods | claims, description | 27 |
computer program | Methods | claims, description | 13 |
detection method | Methods | claims, description | 11 |
loading method | Methods | claims, description | 8 |
framing | Methods | claims, description | 7 |
organization | Effects | abstract, description | 5 |
communication | Methods | abstract, description | 3 |
function | Effects | description | 12 |
diagram | Methods | description | 10 |
process | Effects | description | 9 |
storage | Methods | description | 5 |
Vicia faba | Species | description | 4 |
Vicia faba | Nutrition | description | 4 |
Vicia faba var. major | Nutrition | description | 4 |
data extraction | Methods | description | 4 |
finger joint | Anatomy | description | 4 |
processing | Methods | description | 4 |
rendering | Methods | description | 4 |
action | Effects | description | 3 |
beneficial effect | Effects | description | 2 |
interactive effect | Effects | description | 2 |
modification | Methods | description | 2 |
modification | Effects | description | 2 |
review | Methods | description | 2 |
Phaseolus vulgaris | Species | description | 1 |
Phaseolus vulgaris | Nutrition | description | 1 |
alteration | Effects | description | 1 |
data model | Methods | description | 1 |
development | Methods | description | 1 |
technology | Methods | description | 1 |
irregular effect | Effects | description | 1 |
manufacturing process | Methods | description | 1 |
mapping | Methods | description | 1 |
optical effect | Effects | description | 1 |
progressive effect | Effects | description | 1 |
shortening | Methods | description | 1 |
synchronised effect | Effects | description | 1 |
transfer | Methods | description | 1 |
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/048—Interaction techniques based on graphical user interfaces [GUI]
- G06F3/0484—Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
- G06F3/04842—Selection of displayed objects or displayed text elements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/12—Use of codes for handling textual entities
- G06F40/14—Tree-structured documents
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Artificial Intelligence (AREA)
- Human Computer Interaction (AREA)
- Data Mining & Analysis (AREA)
- Information Transfer Between Computers (AREA)
Claims (9)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910295549.0A CN110147477B (zh) | 2019-04-12 | 2019-04-12 | Method, apparatus, and device for model-based extraction of data resources of a Web system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910295549.0A CN110147477B (zh) | 2019-04-12 | 2019-04-12 | Method, apparatus, and device for model-based extraction of data resources of a Web system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110147477A (zh) | 2019-08-20 |
CN110147477B (zh) | 2021-08-27 |
Family
ID=67588836
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910295549.0A Active CN110147477B (zh) | 2019-04-12 | 2019-04-12 | Method, apparatus, and device for model-based extraction of data resources of a Web system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110147477B (zh) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111913693B (zh) * | 2020-07-30 | 2023-11-14 | 北京数立得科技有限公司 | Method and system for determining service interface subclass templates |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8589366B1 (en) * | 2007-11-01 | 2013-11-19 | Google Inc. | Data extraction using templates |
CN103744609B (zh) * | 2014-01-20 | 2018-10-19 | 华为终端(东莞)有限公司 | Data extraction method and apparatus |
US20160246481A1 (en) * | 2015-02-20 | 2016-08-25 | Ebay Inc. | Extraction of multiple elements from a web page |
- 2019-04-12: CN application CN201910295549.0A, patent CN110147477B (zh), status Active
Also Published As
Publication number | Publication date |
---|---|
CN110147477A (zh) | 2019-08-20 |
Similar Documents
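Publication | Title |
---|---|
US20200019583A1 (en) | Systems and methods for automated repair of webpages |
US10261984B2 (en) | Browser and operating system compatibility |
CN107729475B (zh) | Web page element collection method, apparatus, terminal, and computer-readable storage medium |
CN110069683B (zh) | Method and apparatus for crawling data based on a browser |
JP5756386B2 (ja) | Method, apparatus, and program for supporting the generation and management of metadata for correcting problems in dynamic web applications |
US10572566B2 (en) | Image quality independent searching of screenshots of web content |
US9934208B2 (en) | Populating visual designs with web content |
TW201250492A (en) | Method and system of extracting web page information |
Roy Choudhary et al. | X-PERT: a web application testing tool for cross-browser inconsistency detection |
CN107766344B (zh) | Template rendering method, apparatus, and browser |
CN105095067A (zh) | Method and apparatus for user interface element object identification and automated testing |
CN110222251B (zh) | Service wrapping method based on web page segmentation and search algorithms |
CN103678511A (zh) | Method and apparatus for extracting web page content according to a visual template |
CN112417338B (zh) | Page adaptation method, system, and device |
CN103678509A (zh) | Method and apparatus for generating web page templates |
Chasins et al. | Browser record and replay as a building block for end-user web automation tools |
CN103678510A (zh) | Method and apparatus for providing visual annotation of web pages |
CN104281629A (zh) | Method, apparatus, and client device for extracting pictures from web pages |
CN105447191B (zh) | Intelligent summarization method providing illustrated guidance steps, and corresponding apparatus |
CN110147477B (zh) | Method, apparatus, and device for model-based extraction of data resources of a Web system |
CN111427760A (zh) | Page testing method, apparatus, device, and storage medium |
Shao et al. | Webevo: taming web application evolution via detecting semantic structure changes |
CN107622125B (zh) | Information crawling method and apparatus, and electronic device |
CN105677827B (zh) | Form acquisition method and apparatus |
Su et al. | KaitoroCap: A document navigation capture and visualisation tool |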
Legal Events
Code | Title |
---|---|
PB01 | Publication |
SE01 | Entry into force of request for substantive examination |
GR01 | Patent grant |
CP03 | Change of name, title or address |
Patentee before | Address before | Patentee after | Address after | Country or region (before / after) |
---|---|---|---|---|
Intel Technology Co.,Ltd. | No. 826, building 12345, Phoenix legend, Hanbang, Jingyue Development Zone, Changchun City, Jilin Province | Shenqi Digital Co.,Ltd. | 130117, 30th floor, Building A2, Mingyu Plaza, No. 3777 Ecological Street, Jingyue High tech Industrial Development Zone, Changchun City, Jilin Province | China / China |
Changchun interui Software Co.,Ltd. | No. 826, building 12345, Phoenix legend, Hanbang, Jingyue Development Zone, Changchun City, Jilin Province | Intel Technology Co.,Ltd. | No. 826, building 12345, Phoenix legend, Hanbang, Jingyue Development Zone, Changchun City, Jilin Province | China / China |
BEIJING INTERNETWARE Ltd. | 100080 room 1608, 16 / F, Haidian new technology building, 65 North Fourth Ring Road West, Haidian District, Beijing | Changchun interui Software Co.,Ltd. | No. 826, building 12345, Phoenix legend, Hanbang, Jingyue Development Zone, Changchun City, Jilin Province | China / China |