ZA202108738B - Method, system, medium and electronic equipment for webpage main text analysis

Method, system, medium and electronic equipment for webpage main text analysis

Info

Publication number
ZA202108738B
Authority
ZA
South Africa
Prior art keywords
medium
electronic equipment
text analysis
main text
webpage main
Prior art date
Application number
ZA2021/08738A
Inventor
Guomao Xin
Ruishuang Wang
Shiwei Wu
Tong Chen
Feng Lu
Chun Yang
Original Assignee
Shandong Evayinfo Tech Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2021-06-28
Filing date
2021-11-08
Publication date
2022-01-26
Application filed by Shandong Evayinfo Tech Co Ltd
Publication of ZA202108738B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G06F16/986 Document structures and storage, e.g. HTML extensions
ZA2021/08738A 2021-06-28 2021-11-08 Method, system, medium and electronic equipment for webpage main text analysis ZA202108738B (en)

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202110719543.9A / CN113392354B (en) | 2021-06-28 | 2021-06-28 | Webpage text analysis method, system, medium and electronic equipment

Publications (1)

Publication Number | Publication Date
ZA202108738B (en) | 2022-01-26

Family

ID=77624199

Family Applications (1)

Application Number | Publication | Priority Date | Filing Date | Title
ZA2021/08738A | ZA202108738B (en) | 2021-06-28 | 2021-11-08 | Method, system, medium and electronic equipment for webpage main text analysis

Country Status (2)

Country | Link
CN (1) | CN113392354B (en)
ZA (1) | ZA202108738B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN115203604A (en) * | 2022-09-15 | 2022-10-18 | 成都数之联科技股份有限公司 | Webpage text extraction method, system, device and medium

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN104268148B (en) * | 2014-08-27 | 2018-02-06 | 中国科学院计算技术研究所 | A kind of forum page Information Automatic Extraction method and system based on time string
CN108520007B (en) * | 2018-03-15 | 2021-09-28 | 江河瑞通(北京)技术有限公司 | Web page information extracting method, storage medium and computer equipment
CN108920434B (en) * | 2018-06-06 | 2022-08-30 | 武汉酷犬数据科技有限公司 | Universal webpage theme content extraction method and system
CN111966930B (en) * | 2020-08-17 | 2021-05-04 | 山东亿云信息技术有限公司 | Webpage list analyzing method and system based on XPath sequence
CN112395860A (en) * | 2020-11-27 | 2021-02-23 | 山东省计算中心(国家超级计算济南中心) | Large-scale parallel policy data knowledge extraction method and system
CN112230989B (en) * | 2020-12-14 | 2021-03-12 | 北京智慧星光信息技术有限公司 | Webpage channel navigation bar extraction method, system, electronic equipment and storage medium

Also Published As

Publication number | Publication date
CN113392354B (en) | 2022-09-13
CN113392354A (en) | 2021-09-14

Similar Documents

Publication Publication Date Title
SG11202100959RA (en) Data sharing method, apparatus, and system, and electronic device
EP4228178A4 (en) Communication method and apparatus, computer readable medium, and electronic device
EP3961988A4 (en) Scenario operating method and apparatus, electronic device, and computer readable medium
EP4187479A4 (en) Expression transformation method and apparatus, electronic device, and computer readable medium
SG11202008173QA (en) Webpage translation system, webpage translation apparatus, webpage providing apparatus, and webpage translation method
EP4180923A4 (en) Method for adding annotations, electronic device and related apparatus
EP4122258A4 (en) Method, device and computer readable medium for communications
ZA202108738B (en) Method, system, medium and electronic equipment for webpage main text analysis
EP4228233A4 (en) Method for adding operation sequence, electronic device, and system
EP4224830A4 (en) Data sharing method, apparatus and system, and electronic device
EP4131094A4 (en) Prediction method and apparatus, readable medium, and electronic device
EP4246343A4 (en) Information extraction method and apparatus for text with layout
EP4280799A4 (en) Data receiving method, apparatus and system
EP4089550A4 (en) Method for extracting same-structured data, and apparatus using same
EP4207842A4 (en) Address management method, apparatus and system
EP4192113A4 (en) Access information indication method, apparatus and system
EP4195639A4 (en) Service sharing method, system and electronic device
EP4187940A4 (en) Information transmission method, apparatus and system
EP4123522A4 (en) Resource recovery system, server apparatus, resource recovery apparatus, and method
EP4102440A4 (en) Prediction device, prediction system, and prediction method
SG11202008419UA (en) Method and device for fundus-image sample expansion, electronic device, and non-transitory computer readable storage medium
EP4278758A4 (en) Method, device and computer readable medium for communication
EP4303800A4 (en) Method, computer, and system
GB202016764D0 (en) Computer, system and method
ZA201902204B (en) Website access method, apparatus, and website system