CN107066626A - Method and device for downloading, storing, and classifying files from a terminal's favorites folder - Google Patents

Info

Publication number
CN107066626A
Authority
CN
China
Prior art keywords
file
storage
collection
module
terminal
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN201710338763.0A
Other languages
Chinese (zh)
Inventor
黄伟烈
王德宇
郭建
钟晨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huizhou Desay Industry Research Institute Co Ltd
Original Assignee
Huizhou Desay Industry Research Institute Co Ltd
Application filed by Huizhou Desay Industry Research Institute Co Ltd
Priority to CN201710338763.0A
Publication of CN107066626A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G06F16/972 Access to data in other repository systems, e.g. legacy data or dynamic Web page generation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/955 Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
    • G06F16/9558 Details of hyperlinks; Management of linked annotations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G06F16/986 Document structures and storage, e.g. HTML extensions

Abstract

The invention discloses a method for downloading, storing, and classifying files from a terminal's favorites folder, comprising eight steps: sending a query command, identifying URL addresses, obtaining web page content, downloading and storing files, generating a catalog file, analyzing and classifying the stored files, grouping the catalog entries, and creating hyperlinks. The invention also discloses a device for downloading, storing, and classifying files from a terminal's favorites folder, comprising a network communication module, a control module, a data analysis module, a data storage module, and an information display module. The invention applies to downloading, storing, and classifying the favorites of browsers and of apps with a favorites function on terminals such as computers and mobile terminals.

Description

Method and device for downloading, storing, and classifying files from a terminal's favorites folder
Technical field
The present invention relates to the field of favorites file management for terminals such as computers and mobile terminals, and in particular to a method and device for downloading, storing, and classifying files from a terminal's favorites folder.
Background
With the continuous development of the Internet, the amount of information on the network has grown rapidly. Through browsers, WeChat sharing, and many other channels, people can easily read all kinds of information that they care about or enjoy, and they save selected web page links to their own favorites folders.
However, when a web page is taken down or a website is shut down, the information saved in the favorites folder can no longer be accessed. Damage to or loss of a mobile terminal or computer likewise makes the accumulated information inaccessible. In addition, the information in a favorites folder tends to be numerous and disorganized, so the desired item cannot be found quickly when it is needed.
The prior art manages the hyperlinks in a favorites folder or stores the favorites folder on a remote server. These methods, however, save only a URL (Uniform Resource Locator) address and not the content stored at that address, so in the situations described above the information is permanently lost.
Summary of the invention
The technical problem solved by the present invention is how to ensure that the files in the favorites folder of a terminal such as a mobile terminal or computer are never lost, and how to quickly find the desired content when it needs to be read again.
To achieve the above object, the present invention provides a method for downloading, storing, and classifying files from a terminal's favorites folder, comprising the following steps:
A. Send a query command: send a query command to the favorites folder through the API (Application Programming Interface) of the terminal's favorites folder, where the terminal's favorites folder is the favorites folder of a browser on the terminal or of an app (application) with a favorites function;
B. Identify URL addresses: from the favorites folder's response, identify the URL addresses saved in the favorites folder;
C. Obtain web page content: access the identified URL addresses and obtain the web page content;
D. Download and store files: download the content of each web page and save it as a stored file;
E. Generate a catalog file: obtain the file names of the stored files and generate one catalog file, using the name of each stored file as a catalog entry;
F. Analyze and classify the stored files: analyze the content of each stored file and assign each stored file to a category;
G. Group the catalog entries: group the entries in the catalog file according to the categories of the stored files;
H. Create hyperlinks: create a hyperlink for each catalog entry that points to the corresponding stored file, so that the stored file can be opened directly through the hyperlink.
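Steps A and B might be sketched as follows. The patent does not specify what the favorites API returns, so this sketch assumes a hypothetical nested structure in which a bookmark has a "url" key and a folder has a "children" list; real browsers and apps will differ.

```python
from typing import Any, Dict, Iterator


def iter_bookmark_urls(node: Dict[str, Any]) -> Iterator[str]:
    """Yield every URL found in a nested favorites tree (depth-first)."""
    # Hypothetical schema (an assumption, not from the patent): a node is
    # either a bookmark with a "url" key or a folder with a "children" list.
    url = node.get("url")
    if url:
        yield url
    for child in node.get("children", []):
        yield from iter_bookmark_urls(child)


# Example response to the query command (illustrative data only).
favorites = {
    "name": "root",
    "children": [
        {"name": "AI article", "url": "https://example.com/ai"},
        {"name": "Pets", "children": [
            {"name": "Husky care", "url": "https://example.com/husky"},
        ]},
    ],
}

urls = list(iter_bookmark_urls(favorites))
```

An empty response yields no URLs, which corresponds to ending the procedure when no address is available.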
Further, the query command in step A may be sent automatically at times scheduled according to the user's needs, or sent manually when the user needs to store the files in the favorites folder.
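The two triggers (scheduled and manual) could be combined in a single check. The function name, the interval parameter, and the policy of syncing immediately on first use are assumptions made for illustration.

```python
from datetime import datetime, timedelta
from typing import Optional


def should_sync(last_sync: Optional[datetime], interval: timedelta,
                now: datetime, manual_request: bool = False) -> bool:
    """Return True when a query command should be sent to the favorites folder."""
    if manual_request:       # the user asked for a sync explicitly
        return True
    if last_sync is None:    # never synced before (assumed policy: sync now)
        return True
    return now - last_sync >= interval   # the scheduled interval has elapsed
```

A caller would evaluate this periodically, for example once a minute, and send the query command whenever it returns True.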
Further, step C is specifically carried out by obtaining the complete content of the web page as HTML (HyperText Markup Language) hypertext.
Further, the stored file in step D is either an offline hypertext document with the suffix HTML, produced by saving the web page ("Save as"), or a file in another format such as DOC (document) or PDF (Portable Document Format), assembled from the data stream of the HTML hypertext.
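A minimal sketch of the HTML variant of step D is shown below, assuming the page markup has already been fetched as a string. The helper names and the file-naming rule are illustrative; conversion to DOC or PDF would need a third-party converter and is not shown.

```python
import re
from pathlib import Path


def safe_filename(title: str, suffix: str = ".html") -> str:
    """Turn a page title into a file name that is safe on common file systems."""
    # Replace runs of whitespace and reserved characters with one underscore.
    cleaned = re.sub(r'[\\/:*?"<>|\s]+', "_", title).strip("_")
    return (cleaned or "untitled") + suffix


def save_page(html: str, title: str, storage_dir: Path) -> Path:
    """Write the page markup to storage_dir and return the stored file's path."""
    storage_dir.mkdir(parents=True, exist_ok=True)
    path = storage_dir / safe_filename(title)
    path.write_text(html, encoding="utf-8")
    return path
```

Note that this stores only the markup; images and style sheets referenced by the page would have to be downloaded separately for a fully offline copy.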
Further, the method for the step F is the keyword and high frequency words in the content of identification storage file, by closing Keyword and high frequency words are sorted out to storage file.
The invention also discloses a device for downloading, storing, and classifying files from a terminal's favorites folder, comprising a network communication module, a control module, a data analysis module, a data storage module, and an information display module. The control module is connected to the network communication module, the data analysis module, the data storage module, and the information display module, and controls the operation of each of them.
Further, the network communication module is connected to the terminal and is bound to a browser or an app with a favorites folder on the terminal to achieve one-to-one communication; the network communication module identifies the URL addresses in the favorites folder.
Further, the data storage module is connected to the network communication module and stores the files that the network communication module downloads from the URL addresses.
Further, the data analysis module is connected to the data storage module, and analyzes and classifies the files in the data storage module.
Further, the information display module is connected to the data storage module and displays the information in the catalog file.
The main beneficial effects of the present invention are as follows. The disclosed method saves the files in the favorites folder of a browser or app on the terminal as separate copies and manages them by category, which prevents file loss caused by unexpected events and makes it easier for the user to find and read them. The disclosed device connects through the network communication module to the favorites folder of a browser or app on the terminal and downloads the files in the favorites folder to the data storage module, avoiding the loss of favorited files when a web page is taken down, a website is shut down, or the terminal is damaged. The data analysis module analyzes the files downloaded to the data storage module, and the resulting classification is shown by the information display module, so that content the user wants to read again can be found quickly.
Brief description of the drawings
Fig. 1 is a flow chart of the method of the present invention for downloading, storing, and classifying files from a terminal's favorites folder.
Fig. 2 is a schematic diagram of the module structure of the device of the present invention for downloading, storing, and classifying files from a terminal's favorites folder.
The drawings are for illustrative purposes only and shall not be construed as limiting this patent. To better illustrate the embodiments, some parts in the drawings may be omitted, enlarged, or reduced, and do not represent the dimensions of an actual product. Those skilled in the art will understand that some well-known structures and their descriptions may be omitted from the drawings. The same or similar reference numerals correspond to the same or similar parts. Terms describing positional relationships in the drawings are for illustrative purposes only and shall not be construed as limiting this patent.
Detailed description of the embodiments
To help those skilled in the art understand the invention, it is described in further detail below with reference to the drawings and embodiments.
Embodiment 1
Referring to Fig. 1, a method for downloading, storing, and classifying files from a terminal's favorites folder, where the terminal is a computer or mobile terminal and the favorites folder belongs to a browser or an app with a favorites function, comprises the following steps:
A. Send a query command: an external device establishes communication with the browser or the app with a favorites function. The browser or app provides an open API through which other devices can access the file information in its favorites folder; the query command is sent to the favorites folder through this API;
B. Identify URL addresses: the favorites folder responds after receiving the query command. If URL addresses are available, the external device identifies them; if none are available, the procedure ends;
C. Obtain web page content: the external device accesses the identified URL addresses and obtains the web page content;
D. Download and store files: download the content of each web page and save it as a stored file on the external device;
E. Generate a catalog file: the external device obtains the file names of the stored files and generates one catalog file, which uses the name of each stored file as a catalog entry;
F. Analyze and classify the stored files: analyze the full content of each stored file and assign each stored file to a category;
G. Group the catalog entries: group the entries in the catalog file according to the categories of the corresponding stored files;
H. Create hyperlinks: create a hyperlink for each entry that points to the stored file corresponding to that entry, so that the stored file can be opened directly through the hyperlink.
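Taken together, steps C to H could be orchestrated roughly as below. This is a sketch under stated assumptions rather than the patent's implementation: the URLs are taken as already obtained in steps A and B, the fetch and classify functions are injected by the caller, files are kept in memory instead of on disk, and the numbered file-naming scheme is invented for illustration.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Catalog = Dict[str, List[Dict[str, str]]]


def archive_favorites(
    urls: Iterable[str],
    fetch: Callable[[str], str],
    classify: Callable[[str], str],
) -> Tuple[Dict[str, str], Catalog]:
    """Run steps C to H on URLs already obtained in steps A and B."""
    stored: Dict[str, str] = {}   # step D: file name -> page content
    catalog: Catalog = {}         # steps E and G: category -> entries
    for number, url in enumerate(urls, start=1):
        try:
            content = fetch(url)                  # step C
        except OSError:
            continue                              # page unreachable: skip it
        file_name = f"page_{number:04d}.html"     # hypothetical naming scheme
        stored[file_name] = content               # step D (in memory here)
        category = classify(content)              # step F
        # Steps E and H: the entry carries its name and a link to the file.
        entry = {"name": file_name, "link": file_name, "source": url}
        catalog.setdefault(category, []).append(entry)   # step G
    return stored, catalog
```

Injecting fetch and classify keeps the flow testable without network access; network errors raised by urllib are subclasses of OSError, so they are caught by the handler above.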
Optionally, in step A above, the query command may be sent automatically at times scheduled according to the user's needs, or sent manually when the user needs to store the files in the favorites folder.
Optionally, in step C above, the web page content is obtained by retrieving the complete content of the web page as HTML hypertext.
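One way to retrieve a page as HTML and read its title, using only the Python standard library, is sketched here. The function and class names are illustrative, and the fallback to UTF-8 when the server declares no character set is an assumption.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TitleParser(HTMLParser):
    """Collect the text inside the first <title> element."""

    def __init__(self) -> None:
        super().__init__()
        self._in_title = False
        self._done = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True   # ignore any later <title> elements

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_title(html: str) -> str:
    """Return the page title, or an empty string if there is none."""
    parser = TitleParser()
    parser.feed(html)
    return parser.title.strip()


def fetch_html(url: str, timeout: float = 10.0) -> str:
    """Download a page and decode it to text (requires network access)."""
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

The extracted title can serve as the name of the stored file and, later, as its catalog entry.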
Optionally, in step D above, the file is downloaded and stored either as an offline hypertext document with the suffix HTML, produced by saving the web page, or as a file in another format such as DOC or PDF, assembled from the data stream of the HTML hypertext.
Optionally, in step F above, the stored files may be analyzed and classified by identifying the keywords and high-frequency words in the file content and classifying each file according to them.
The method above completes the download and storage of the favorites files: the files in the terminal's favorites folder are saved as separate copies on the external device, which avoids the loss of favorited files when a web page is taken down, a website is shut down, or the terminal is damaged. At the same time, the stored files are managed by category, so that the user can quickly find the content to be read again.
Embodiment 2
Referring to Fig. 2, the invention also discloses a device for downloading, storing, and classifying files from a terminal's favorites folder. The device comprises a network communication module, a control module, a data analysis module, a data storage module, and an information display module, which are connected together to form the device.
The device is connected to a terminal, such as a mobile terminal or computer, and is bound to a browser or an app with a favorites folder on the terminal; the bound browser or app can communicate one-to-one with the device directly. The browser or app provides an open API. The control module of the device sends a query command to the terminal through the network communication module, and the query command reaches the browser or app through the open API. After receiving the query command, the browser or app sends the URL addresses of the files in its favorites folder to the control module through the network communication module. On receiving a URL address, the control module accesses the favorited web page at that address and obtains its complete content as HTML hypertext. Under the control of the control module, the complete web page content is then downloaded to the data storage module of the device by saving the page, forming an offline hypertext stored file with the suffix HTML; alternatively, the data stream of the HTML hypertext can be obtained and assembled directly in the data storage module into a stored file in a format such as DOC or PDF.
Through the above steps, the files in the favorites folder of a browser or app on the terminal are downloaded and stored in the device outside the terminal, which avoids the loss of favorited files when a web page is taken down, a website is shut down, or the terminal is damaged.
After the files have been downloaded and stored, the control module generates a catalog file in the data storage module. There is only one catalog file; each time a stored file is added, one entry is added to the catalog file. The entry is taken from the title of the stored file, so the catalog file lists the titles of all the web pages the user has favorited. A hyperlink pointing to the corresponding stored file is also generated for each stored file. The catalog file thus gives a complete view of all the user's stored favorites, and each stored file can be opened directly through its hyperlink.
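A catalog file of this kind could be a single HTML page with one link per stored file. The sketch below assumes the catalog sits in the same directory as the stored files, so relative links suffice; the function name and the list markup are illustrative choices, not taken from the patent.

```python
from html import escape
from typing import List, Tuple
from urllib.parse import quote


def build_catalog(entries: List[Tuple[str, str]]) -> str:
    """Render (title, stored file name) pairs as one HTML catalog fragment."""
    items = [
        # quote() makes the file name safe inside a URL; escape() makes the
        # title safe to display as HTML text.
        f'<li><a href="{quote(file_name)}">{escape(title)}</a></li>'
        for title, file_name in entries
    ]
    return "<ul>\n" + "\n".join(items) + "\n</ul>"
```

Adding a stored file then amounts to appending one pair to the entries and regenerating the page.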
In addition, under the control of the control module, the data analysis module analyzes the keywords and high-frequency words in each stored file and classifies the file according to the result. For example, if the keyword "artificial intelligence" appears many times in a stored file together with keywords such as "deep learning algorithm" and "semantic recognition", the file can be judged to be a science and technology article; if the keyword "dog" appears many times together with keywords such as "husky" and "pet", the file can be judged to be a pet article.
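The two examples can be expressed as a small rule table, as in this sketch. The category names, the keyword lists, and the scoring rule (total keyword occurrences, highest score wins) are assumptions for illustration; the patent does not fix a scoring scheme.

```python
from typing import Dict, List

# Illustrative rule table built from the two examples in the text.
CATEGORY_KEYWORDS: Dict[str, List[str]] = {
    "technology": ["artificial intelligence", "deep learning", "semantic recognition"],
    "pets": ["dog", "husky", "pet"],
}


def classify(text: str, default: str = "uncategorized") -> str:
    """Pick the category whose keywords occur most often in the text."""
    lowered = text.lower()
    # Plain substring counting: simple, but "pet" would also match inside
    # "competition", so a real system should match on word boundaries.
    scores = {
        category: sum(lowered.count(keyword) for keyword in keywords)
        for category, keywords in CATEGORY_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else default
```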
All stored files are classified by the analysis described above, and the entries in the generated catalog file are arranged by category and shown by the information display module. This removes the need for tedious manual searching: when the user wants to read a favorited file again, the required file can be found quickly, which makes the favorites easier to use and organize.
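Arranging catalog entries by category for display might look like the following. The plain-text layout and the alphabetical ordering are illustrative choices; the patent only requires that entries be grouped by category.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_entries(entries: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (title, category) pairs by category, sorting categories and titles."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for title, category in entries:
        groups[category].append(title)
    return {category: sorted(titles) for category, titles in sorted(groups.items())}


def render_listing(groups: Dict[str, List[str]]) -> str:
    """Format the grouped entries as an indented plain-text listing."""
    lines: List[str] = []
    for category, titles in groups.items():
        lines.append(f"[{category}]")
        lines.extend(f"  - {title}" for title in titles)
    return "\n".join(lines)
```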
The above is one specific implementation of the present invention. Although it is described in specific detail, it shall not be construed as limiting the scope of the patent. It should be noted that those of ordinary skill in the art can make various modifications and improvements without departing from the concept of the invention, and these obvious alternatives all fall within the protection scope of the invention.

Claims (10)

1. A method for downloading, storing, and classifying files from a terminal's favorites folder, characterized by comprising the following steps:
A. Send a query command: send a query command to the favorites folder through the API of the terminal's favorites folder, where the terminal's favorites folder is the favorites folder of a browser on the terminal or of an app with a favorites function;
B. Identify URL addresses: from the favorites folder's response, identify the URL addresses saved in the favorites folder;
C. Obtain web page content: access the identified URL addresses and obtain the web page content;
D. Download and store files: download the content of each web page and save it as a stored file;
E. Generate a catalog file: obtain the file names of the stored files and generate one catalog file, using the name of each stored file as a catalog entry;
F. Analyze and classify the stored files: analyze the content of each stored file and assign each stored file to a category;
G. Group the catalog entries: group the entries in the catalog file according to the categories of the stored files;
H. Create hyperlinks: create a hyperlink for each catalog entry that points to the corresponding stored file, so that the stored file can be opened directly through the hyperlink.
2. The method for downloading, storing, and classifying files from a terminal's favorites folder according to claim 1, characterized in that the query command in step A is sent automatically at times scheduled according to the user's needs, or sent manually when the user needs to store the files in the favorites folder.
3. The method for downloading, storing, and classifying files from a terminal's favorites folder according to claim 1, characterized in that step C is specifically carried out by obtaining the complete content of the web page as HTML hypertext.
4. The method for downloading, storing, and classifying files from a terminal's favorites folder according to claim 1, characterized in that the stored file in step D is an offline hypertext document with the suffix HTML, produced by saving the web page, or a DOC or PDF file assembled from the data stream of the HTML hypertext.
5. The method for downloading, storing, and classifying files from a terminal's favorites folder according to claim 1, characterized in that step F is carried out by identifying the keywords and high-frequency words in the content of each stored file and classifying the stored file according to those keywords and high-frequency words.
6. A device for downloading, storing, and classifying files from a terminal's favorites folder, characterized by comprising a network communication module, a control module, a data analysis module, a data storage module, and an information display module, wherein the control module is connected to the network communication module, the data analysis module, the data storage module, and the information display module, and controls the operation of each of them.
7. The device for downloading, storing, and classifying files from a terminal's favorites folder according to claim 6, characterized in that the network communication module is connected to the terminal and is bound to a browser or an app with a favorites folder on the terminal to achieve one-to-one communication, and the network communication module identifies the URL addresses in the favorites folder.
8. The device for downloading, storing, and classifying files from a terminal's favorites folder according to claim 6, characterized in that the data storage module is connected to the network communication module and stores the files that the network communication module downloads from the URL addresses.
9. The device for downloading, storing, and classifying files from a terminal's favorites folder according to claim 6, characterized in that the data analysis module is connected to the data storage module, and analyzes and classifies the files in the data storage module.
10. The device for downloading, storing, and classifying files from a terminal's favorites folder according to claim 6, characterized in that the information display module is connected to the data storage module and displays the information in the catalog file.
CN201710338763.0A 2017-05-15 2017-05-15 A kind of terminal collection file download storage, sort management method and device Withdrawn CN107066626A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710338763.0A CN107066626A (en) 2017-05-15 2017-05-15 A kind of terminal collection file download storage, sort management method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710338763.0A CN107066626A (en) 2017-05-15 2017-05-15 A kind of terminal collection file download storage, sort management method and device

Publications (1)

Publication Number Publication Date
CN107066626A true CN107066626A (en) 2017-08-18

Family

ID=59597439

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710338763.0A Withdrawn CN107066626A (en) 2017-05-15 2017-05-15 A kind of terminal collection file download storage, sort management method and device

Country Status (1)

Country Link
CN (1) CN107066626A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109033306A (en) * 2018-07-17 2018-12-18 佛山市灏金赢科技有限公司 A kind of browsing webpage method for sorting and system for mobile client
CN110110075A (en) * 2017-12-25 2019-08-09 中国电信股份有限公司 Web page classification method, device and computer readable storage medium
CN110245315A (en) * 2019-06-24 2019-09-17 北京向上一心科技有限公司 Method, apparatus, controller and the storage medium of APP title bar are simulated in webpage
CN111104619A (en) * 2018-10-25 2020-05-05 青岛海信移动通信技术股份有限公司 Method for collecting articles and mobile terminal

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6163779A (en) * 1997-09-29 2000-12-19 International Business Machines Corporation Method of saving a web page to a local hard drive to enable client-side browsing
JP2004078446A (en) * 2002-08-14 2004-03-11 Nec Corp Keyword extraction device, extraction method, document retrieval system, retrieval method, device and method for classifying document, and program
CN102831186A (en) * 2012-08-02 2012-12-19 深圳市同洲电子股份有限公司 Method and device for storing and searching webpage
CN103685514A (en) * 2013-12-13 2014-03-26 北京奇虎科技有限公司 Method for storing page in webpage favorite and browser
CN104573001A (en) * 2015-01-07 2015-04-29 北京联合大学 Mobile terminal-based webpage data acqusition and classification method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
蓝色海岸: "瞧瞧懒人整理文件的大懒招", 《电脑爱好者》 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110110075A (en) * 2017-12-25 2019-08-09 中国电信股份有限公司 Web page classification method, device and computer readable storage medium
CN109033306A (en) * 2018-07-17 2018-12-18 佛山市灏金赢科技有限公司 A kind of browsing webpage method for sorting and system for mobile client
CN111104619A (en) * 2018-10-25 2020-05-05 青岛海信移动通信技术股份有限公司 Method for collecting articles and mobile terminal
CN111104619B (en) * 2018-10-25 2023-09-26 青岛海信移动通信技术有限公司 Method for collecting articles and mobile terminal
CN110245315A (en) * 2019-06-24 2019-09-17 北京向上一心科技有限公司 Method, apparatus, controller and the storage medium of APP title bar are simulated in webpage

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20170818