CN101763419A - Method for synchronously updating remote rss data by local database - Google Patents
- Publication number
- CN101763419A
- Authority
- CN
- China
- Prior art keywords
- rss
- item
- source
- information
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Information Transfer Between Computers (AREA)
Abstract
The invention provides a method for synchronously updating remote RSS data through a local database, and belongs to the technical field of network database updating. The method comprises the following steps: 1) a content server parses all RSS sources and stores the parsed RSS information in a local database; 2) the RSS information is classified and integrated into the local database of the content server; and 3) the content server updates the RSS content. The method addresses the problems that direct client access to RSS data is slow, that classifying RSS data is complicated, and that the operation is cumbersome.
Description
Technical field
The present invention relates to a method for synchronously updating remote RSS data through a local database, and belongs to the technical field of network database updating.
Background art
RSS (Really Simple Syndication, also called aggregated content) is a simple means of sharing content online. Subscribing through RSS to time-sensitive content lets users obtain information faster, and a website that provides RSS output makes it easier for users to receive its latest updates. RSS is used to share information between websites: with client-side aggregator software that supports RSS (for example SharpReader, RSS Reader, NewzCrawler or FeedDemon), a network user can read the content of a website that supports RSS output without opening its pages. An RSS source is a format for describing and synchronizing website content, and is currently one of the most widely used XML applications. RSS provides a technical platform for spreading information rapidly and makes everyone a potential information provider. Once an RSS file is published, the information in that RSS feed can be used directly by other websites, and because the data is in standard XML format it can also be used by other terminals and services. RSS is currently among the most popular resource-sharing applications and can be regarded as an extension of the resource-sharing model.
RSS syntax: RSS is a text-based format. It is a form of XML (Extensible Markup Language); put simply, an RSS file is an XML file with an associated DTD (Document Type Definition). An RSS file is a piece of standard XML data and generally uses rss, xml or rdf as its file extension. An RSS file (often also called an RSS feed or channel) usually contains only a simple list of items. In general, each item contains a title (title), a short description (description) and a URL link (link, for example the address of a web page). Other information, such as the date (pubDate) or the author (author), is optional. Structure of RSS: an RSS 2.0 file consists of a channel element and its item child elements. Every RSS file must conform to the XML 1.0 specification, and the version attribute of the root element <rss> indicates which RSS specification the document follows. The channel element describes the RSS feed. Three of its child elements are required: <title>, <description> and <link>, where <title> gives the title of the RSS source, <description> describes the channel, and <link> gives the URL corresponding to the channel. The other child elements are optional, for example <image>, <language>, <category>, <copyright> and <pubDate>: <image> specifies a GIF, JPEG or PNG picture to display with the channel, <language> states the language the feed is written in, <category> declares one or more categories the channel belongs to, <copyright> is the copyright notice for the feed, and <pubDate> gives the date on which the feed was published.
The <item> element is the most important part of an RSS document. Each <channel> element can contain one or more <item> elements, and each <item> defines one article or "story" in the RSS feed. Its content changes frequently and is used to present updated content. Within <item>, the <title>, <description> and <link> elements are treated here as required (strictly, RSS 2.0 only requires that at least one of <title> or <description> be present): <title> gives the title of the entry, <description> describes the entry, and <link> gives the corresponding URL. There are also optional elements such as <pubDate>, <source>, <author>, <comments>, <category> and <guid>: <pubDate> is the date the entry was published, <source> names the third-party source the entry came from, <author> gives author information, <comments> links the item to a page of comments about it, <category> indicates the category the entry belongs to, and <guid> defines a unique identifier for the item. In general, the description of an item may contain the full text of a news story, or only an excerpt or summary. The item's link usually points to the full content, so that users can read the site's latest information.
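The channel and item structure described above can be illustrated with a short sketch. It is written in Python with the standard library rather than the PHP XML_RSS class that the method itself names, and the sample feed, its URLs and the function name parse_feed are hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 document; titles and URLs are illustrative only.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example News</title>
    <link>http://example.com/</link>
    <description>Hypothetical feed used only for illustration</description>
    <item>
      <title>First story</title>
      <link>http://example.com/1</link>
      <description>Summary of the first story</description>
      <pubDate>Mon, 28 Dec 2009 08:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Second story</title>
      <link>http://example.com/2</link>
      <description>Summary of the second story</description>
    </item>
  </channel>
</rss>"""

ITEM_FIELDS = ("title", "link", "description", "pubDate")


def parse_feed(xml_text):
    """Return (channel_info, items) for an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    channel = root.find("channel")
    channel_info = {
        "version": root.get("version"),  # version attribute of <rss>
        "title": channel.findtext("title"),
        "link": channel.findtext("link"),
        "description": channel.findtext("description"),
    }
    # findtext returns None for optional elements that are absent.
    items = [
        {field: item.findtext(field) for field in ITEM_FIELDS}
        for item in channel.findall("item")
    ]
    return channel_info, items


channel_info, items = parse_feed(SAMPLE_FEED)
print(channel_info["title"], len(items))
```

The second item has no <pubDate>, so its pubDate field is None, which reflects the optional status of that element.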
RSS enables content sharing between websites: by subscribing to RSS, a user can read the content of a website that supports RSS output without opening its pages. Users can also read RSS content directly through an RSS reader or an online tool. However, these applications require the user to actively look for RSS sources and add them to a list manually. The operation is cumbersome, the data is mixed together, and the result is inconvenient for the user.
Traditional RSS subscription obtains the element information of an RSS feed by accessing the feed directly. The user first sends an RSS connection request to the server; after receiving the request the server connects to the Internet; the Internet returns the RSS data to the server; and finally the server returns the data to the user. The steps are as follows:
(1) The user sends a connection request to the server;
(2) The server connects to the RSS source on the Internet using the RSS link submitted by the user;
(3) The Internet returns the RSS data to the server;
(4) The server returns the RSS information to the user.
This approach is slow to access and complicated to process, and once the server goes offline the system cannot run normally; reliability drops sharply and users cannot be given high-quality service. The approach described in "Research on Personalized Information Services Based on RSS" (Computer Knowledge and Technology, vol. 5, no. 9, March 2009) belongs to this category.
Summary of the invention
To overcome the defects and deficiencies of the prior art, namely that direct user access to RSS data is slow, that RSS data classification is complicated and disorganized, and that the operation is cumbersome, the invention provides a method for synchronously updating remote RSS data through a local database.
The technical idea of the invention is to perform the RSS subscription directly at the content server, classify the RSS information from different sources, integrate it into a local database, and have the content server provide the RSS service to users directly.
The technical solution of the invention is as follows:
A method for synchronously updating remote RSS data through a local database, with the following steps:
1) The content server parses all RSS sources and stores the parsed RSS information in the local database;
2) The RSS information obtained is classified and integrated into the local database of the content server;
3) The content server updates the RSS content.
In step 1) above, the content server parses all RSS sources and stores the parsed RSS information in the local database. The specific steps are as follows:
(1) Create the XML_RSS object for an RSS source: $rss =& new XML_RSS($url); where $url is the link of that RSS source;
(2) Parse the RSS source: $rss->parse();
(3) Get all items in the RSS source: $items = $rss->getItems();
(4) Parse all RSS sources and store the information of each RSS source in a data table; one data table is created in the local database for each RSS source so that the information of different sources can be told apart.
Usually, the title, description, link and pubDate elements of each item are extracted. Items are classified according to their title and description elements by fuzzy matching against the channels of the RSS source they belong to, and are stored in different data tables in the local database.
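The storage part of step 1) can be sketched as follows, assuming SQLite as the local database and one table per RSS source. The table naming scheme (rss_source_N), the choice of link as the primary key, and the function names are assumptions made for this sketch; the method does not specify a schema.

```python
import sqlite3


def source_table(source_id):
    """Table name for one RSS source (hypothetical naming scheme).

    int() ensures that only a number is interpolated into the table name.
    """
    return "rss_source_%d" % int(source_id)


def store_items(conn, source_id, items):
    """Insert parsed items into the table of their source.

    Returns the number of rows actually inserted; items whose link is
    already stored are skipped.
    """
    table = source_table(source_id)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS %s ("
        "link TEXT PRIMARY KEY, title TEXT, description TEXT, pubdate TEXT)"
        % table
    )
    before = conn.total_changes
    for item in items:
        conn.execute(
            "INSERT OR IGNORE INTO %s (link, title, description, pubdate) "
            "VALUES (?, ?, ?, ?)" % table,
            (item["link"], item["title"], item["description"],
             item.get("pubDate")),
        )
    conn.commit()
    return conn.total_changes - before


conn = sqlite3.connect(":memory:")
sample_items = [
    {"link": "http://example.com/1", "title": "First story",
     "description": "Summary one",
     "pubDate": "Mon, 28 Dec 2009 08:00:00 GMT"},
    {"link": "http://example.com/2", "title": "Second story",
     "description": "Summary two"},
]
print(store_items(conn, 1, sample_items))
```

INSERT OR IGNORE is SQLite syntax; another database would need its own way of skipping rows that already exist.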
In step 2) above, the RSS information obtained is classified and integrated into the local database of the content server. The specific steps are as follows:
(1) Determine from the channel classification information which categories of RSS information exist; one data table is created for each channel so that the data of different channels can be told apart;
(2) Determine which category an item belongs to from its parsed <title> and <description>;
(3) Obtain the full text of the item from its link information, and add the item to the corresponding category;
(4) Classify all items.
In step (2) above, the category of an item is determined from its parsed <title> and <description>. The specific steps are as follows:
1. First assign different weights to <title> and <description>;
2. Perform column matching analysis to determine the column the item belongs to;
3. Perform channel analysis within that column to determine the channel category the item belongs to;
4. Check whether the item already exists in that category; if so, do not insert it into the data table; otherwise insert the item into the data table of the corresponding channel category.
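The four sub-steps above can be sketched with a simple weighted keyword count. The method does not state how the weights or the matching are defined, so the weights, the two-level column/channel taxonomy, its keywords and all function names below are assumptions for illustration, and substring counting stands in for whatever fuzzy matching is actually intended.

```python
# Assumed weights: a hit in the title counts twice as much as one in the
# description.
TITLE_WEIGHT = 2.0
DESCRIPTION_WEIGHT = 1.0

# Hypothetical taxonomy: column -> channel -> keywords.
TAXONOMY = {
    "sports": {
        "football": ["football", "goal"],
        "basketball": ["basketball", "dunk"],
    },
    "technology": {
        "software": ["software", "database"],
        "hardware": ["chip", "processor"],
    },
}


def score(item, keywords):
    """Weighted count of keyword occurrences in title and description."""
    title = item["title"].lower()
    description = item["description"].lower()
    return sum(
        TITLE_WEIGHT * title.count(word)
        + DESCRIPTION_WEIGHT * description.count(word)
        for word in keywords
    )


def column_score(item, column):
    """Total score of an item over all channels of one column."""
    return sum(score(item, words) for words in TAXONOMY[column].values())


def classify(item):
    """Return (column, channel), or None when no keyword matches."""
    column = max(TAXONOMY, key=lambda c: column_score(item, c))
    if column_score(item, column) == 0:
        return None
    channels = TAXONOMY[column]
    channel = max(channels, key=lambda ch: score(item, channels[ch]))
    return column, channel


def add_if_new(store, item):
    """Add the item to its category unless its link is already there."""
    category = classify(item)
    if category is None:
        return False
    links = store.setdefault(category, set())
    if item["link"] in links:
        return False
    links.add(item["link"])
    return True


example_item = {
    "title": "New database software released",
    "description": "A processor-friendly database engine",
    "link": "http://example.com/a",
}
print(classify(example_item))
```

In this sketch the in-memory dictionary store plays the role of the per-channel data tables; a real deployment would check and insert against the database instead.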
To keep the RSS information timely, the RSS content must be updated at the content server.
In step 3) above, the content server updates the RSS content. The specific steps are as follows:
(1) If the first item of an RSS source already exists in the database, stop updating; otherwise continue extracting items from that source, insert them into the data table for that source, and classify them;
(2) Update the RSS source item by item until an extracted item already exists in the data table;
(3) Update all RSS sources.
When the content server updates RSS sources in step 3), different update frequencies can be set for different RSS sources according to how quickly each source is updated.
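The stop-at-first-known-item rule of steps (1) and (2) can be sketched as below. The sketch assumes that a feed lists its items newest first and that an item is identified by its link; both are assumptions, as is the function name update_source, and an in-memory set stands in for the data table of the source.

```python
def update_source(known_links, feed_items):
    """Collect items from the top of the feed until a known one is met.

    known_links: set of links already stored for this source.
    feed_items:  items in feed order, assumed newest first.
    Returns the list of new items and records their links as known.
    """
    new_items = []
    for item in feed_items:
        if item["link"] in known_links:
            # Everything from here on was stored by an earlier update.
            break
        new_items.append(item)
    known_links.update(item["link"] for item in new_items)
    return new_items


known = {"http://example.com/3", "http://example.com/4"}
feed = [
    {"link": "http://example.com/1"},
    {"link": "http://example.com/2"},
    {"link": "http://example.com/3"},
    {"link": "http://example.com/4"},
]
print([item["link"] for item in update_source(known, feed)])
```

If the first item of the feed is already known, the loop exits immediately and nothing is inserted, which corresponds to the "stop updating" case in step (1).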
In step (3) above, all RSS sources are updated. The specific steps are as follows:
1. Analyze the <pubDate> (publication time) of the items of an RSS source, determine the similarity of publication times between adjacent items, and build a similarity vector table;
2. Determine the update frequency of that RSS source from the similarity vector table;
3. Build similarity vector tables for all RSS sources and determine the update frequency of every RSS source.
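The text does not define the similarity vector table, so the following sketch substitutes a simpler stand-in: it lists the time gaps between adjacent publication dates of a source and uses their median, clamped to a range, as the polling interval. The clamping bounds and the function names are assumptions; dates are expected in the RFC 822 style that RSS 2.0 uses for <pubDate>.

```python
from email.utils import parsedate_to_datetime
from statistics import median


def publication_intervals(pubdates):
    """Seconds between adjacent publication times, oldest to newest."""
    times = sorted(parsedate_to_datetime(value) for value in pubdates)
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]


def polling_interval(pubdates, minimum=300.0, maximum=86400.0):
    """Suggested seconds between updates of one source.

    Uses the median gap between items, clamped to [minimum, maximum];
    with fewer than two dated items it falls back to the maximum.
    """
    intervals = publication_intervals(pubdates)
    if not intervals:
        return maximum
    return min(maximum, max(minimum, median(intervals)))


dates = [
    "Mon, 28 Dec 2009 08:00:00 GMT",
    "Mon, 28 Dec 2009 09:00:00 GMT",
    "Mon, 28 Dec 2009 11:00:00 GMT",
]
print(polling_interval(dates))
```

Running the same calculation for every source gives each source its own update frequency, so that frequently updated sources are polled more often than quiet ones.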
The system used by the method consists of three parts: the user terminals, the content server that stores RSS information, and the Internet that provides the RSS sources. The content server integrates and classifies the data of the different RSS sources on the Internet, stores it in the local database, and provides a subscription service to users. Synchronously updating remote RSS data through the local database can be divided into two functional modules: a database update module and an information subscription module.
The content server extracts the remote RSS data and stores it, classified, on the local content server. The system can then output different RSS information to each user according to what the user has customized, and the user does not need to know where the content comes from. Because the RSS information is stored on the local content server, user access is faster and the user saves time. The method avoids having the user side connect to RSS sources directly, which improves access speed; and when the content server loses its connection to an RSS source the system can still work, which avoids service interruptions caused by network failures. At the content server, the information from the RSS sources is updated regularly, which keeps the information delivered to users timely and provides a quality service: the user only needs to subscribe to a category of information to receive the RSS content on the network that belongs to that category.
In the RSS subscription approach of the background art, the user subscribes to RSS information through an RSS reader and obtains the element information of a feed by accessing the feed directly: the user first sends an RSS connection request, the server connects to the Internet after receiving the request, the Internet returns the RSS data to the server, and finally the server returns the RSS data to the user. In the method of the invention, by contrast, the content server separates the user from the Internet. The content server obtains the information of the RSS sources from the Internet, and the user only needs to submit a subscription and send browse requests to the content server to obtain information services that originate from the Internet.
The method solves the problem that subscribing to RSS directly is slow. It builds on traditional RSS subscription by performing the subscription at the content server, parsing the RSS sources, integrating and classifying the parsed RSS information, and storing it in the local database with an optimized data storage format. In other words, it adds an intermediate service layer at the content server to improve access speed.
Description of drawings
Fig. 1 is a flowchart of the method of the invention, in which 1)-3) are its steps.
Fig. 2 is a flowchart of the specific steps of step 1) shown in Fig. 1, in which (1)-(4) are its steps.
Fig. 3 is a flowchart of the specific steps of step 2) shown in Fig. 1, in which a-d are its steps.
Fig. 4 is a flowchart of the specific steps of step b shown in Fig. 3, in which e-h are its steps.
Fig. 5 is a flowchart of the specific steps of step 3) shown in Fig. 1, in which i-k are its steps.
Fig. 6 is a flowchart of the specific steps of step k shown in Fig. 5, in which l-n are its steps.
Embodiment
The invention is further described below with reference to the drawings and an embodiment, but is not limited thereto.
Embodiment:
A method for synchronously updating remote RSS data through a local database, as shown in Fig. 1, with the following steps:
1) The content server parses all RSS sources and stores the parsed RSS information in the local database;
2) The RSS information obtained is classified and integrated into the local database of the content server;
3) The content server updates the RSS content.
In step 1) above, the content server parses all RSS sources and stores the parsed RSS information in the local database. As shown in Fig. 2, the specific steps are as follows:
(1) Create the XML_RSS object for an RSS source: $rss =& new XML_RSS($url); where $url is the link of that RSS source;
(2) Parse the RSS source: $rss->parse();
(3) Get all items in the RSS source: $items = $rss->getItems();
(4) Parse all RSS sources and store the information of each RSS source in a data table; one data table is created in the local database for each RSS source so that the information of different sources can be told apart.
In step 2) above, the RSS information obtained is classified and integrated into the local database of the content server. As shown in Fig. 3, the specific steps are as follows:
a. Determine from the channel classification information which categories of RSS information exist; one data table is created for each channel so that the data of different channels can be told apart;
b. Determine which category an item belongs to from its parsed <title> and <description>;
c. Obtain the full text of the item from its link information, and add the item to the corresponding category;
d. Classify all items.
In step b above, the category of an item is determined from its parsed <title> and <description>. As shown in Fig. 4, the specific steps are as follows:
e. First assign different weights to <title> and <description>;
f. Perform column matching analysis to determine the column the item belongs to;
g. Perform channel analysis within that column to determine the channel category the item belongs to;
h. Check whether the item already exists in that category; if so, do not insert it into the data table; otherwise insert the item into the data table of the corresponding channel category.
In step 3) above, the content server updates the RSS content. As shown in Fig. 5, the specific steps are as follows:
i. If the first item of an RSS source already exists in the database, stop updating; otherwise continue extracting items from that source, insert them into the data table for that source, and classify them;
j. Update the RSS source item by item until an extracted item already exists in the data table;
k. Update all RSS sources.
In step k above, all RSS sources are updated. As shown in Fig. 6, the specific steps are as follows:
l. Analyze the <pubDate> (publication time) of the items of an RSS source, determine the similarity of publication times between adjacent items, and build a similarity vector table;
m. Determine the update frequency of that RSS source from the similarity vector table;
n. Build similarity vector tables for all RSS sources and determine the update frequency of every RSS source.
Claims (6)
1. the method for a synchronously updating remote rss data by local database, step is as follows:
1) content server is resolved all rss sources, and the rss information of resolving is put into local data base;
2) the rss information that obtains is classified, be incorporated into the local data base of content server;
3) the content server end carries out the rss content update.
2. resolve all rss sources as the described content server of step 1) in the claim 1, the rss information of resolving is put into local data base, concrete steps are as follows:
(1) the XML_RSS object: $rss=﹠amp of the some rss of generation source correspondence; New XML_RSS ($url); Url is the link of this rss source correspondence;
(2) resolve this rss source: $rss->parse ();
(3) obtain all item:$items=$rss->getItems () in this rss source;
(4) all rss sources are resolved, and the information in each rss source is all left in the tables of data, wherein in local data base, set up a tables of data, to identify different rss source information for each rss source.
3. The method of claim 1, wherein in step 2) the RSS information obtained is classified and integrated into the local database of the content server by the following specific steps:
(1) Determine from the channel classification information which categories of RSS information exist; one data table is created for each channel so that the data of different channels can be told apart;
(2) Determine which category an item belongs to from its parsed <title> and <description>;
(3) Obtain the full text of the item from its link information, and add the item to the corresponding category;
(4) Classify all items.
4. The method of claim 3, wherein in step (2) the category of an item is determined from its parsed <title> and <description> by the following specific steps:
1. First assign different weights to <title> and <description>;
2. Perform column matching analysis to determine the column the item belongs to;
3. Perform channel analysis within that column to determine the channel category the item belongs to;
4. Check whether the item already exists in that category; if so, do not insert it into the data table; otherwise insert the item into the data table of the corresponding channel category.
5. The method of claim 1, wherein in step 3) the content server updates the RSS content by the following specific steps:
(1) If the first item of an RSS source already exists in the database, stop updating; otherwise continue extracting items from that source, insert them into the data table for that source, and classify them;
(2) Update the RSS source item by item until an extracted item already exists in the data table;
(3) Update all RSS sources.
When the content server updates RSS sources in step 3), different update frequencies can be set for different RSS sources according to how quickly each source is updated.
6. The method of claim 5, wherein in step (3) all RSS sources are updated as follows:
1. Analyze the <pubDate> (publication time) of the items of an RSS source, determine the similarity of publication times between adjacent items, and build a similarity vector table;
2. Determine the update frequency of that RSS source from the similarity vector table;
3. Build similarity vector tables for all RSS sources and determine the update frequency of every RSS source.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN200910255744A CN101763419A (en) | 2009-12-28 | 2009-12-28 | Method for synchronously updating remote rss data by local database |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN200910255744A CN101763419A (en) | 2009-12-28 | 2009-12-28 | Method for synchronously updating remote rss data by local database |
Publications (1)
Publication Number | Publication Date |
---|---|
CN101763419A true CN101763419A (en) | 2010-06-30 |
Family
ID=42494583
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN200910255744A Pending CN101763419A (en) | 2009-12-28 | 2009-12-28 | Method for synchronously updating remote rss data by local database |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN101763419A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2012016404A1 (en) * | 2010-08-05 | 2012-02-09 | 中兴通讯股份有限公司 | Really simple syndication subscription method and client thereof |
CN102779146A (en) * | 2012-04-26 | 2012-11-14 | 新奥特(北京)视频技术有限公司 | Method and system for updating data in local database in real time |
CN102799602A (en) * | 2012-04-26 | 2012-11-28 | 新奥特(北京)视频技术有限公司 | Method and system for acquiring data from Internet |
CN103207859A (en) * | 2012-01-11 | 2013-07-17 | 北京四维图新科技股份有限公司 | Method and device for integrating databases |
WO2013185587A1 (en) * | 2012-06-11 | 2013-12-19 | 腾讯科技(深圳)有限公司 | Information syndication file synchronizing method, device and system |
CN108665654A (en) * | 2018-05-18 | 2018-10-16 | 任飞翔 | Cash register information synchronization method and cash register system |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2012016404A1 (en) * | 2010-08-05 | 2012-02-09 | 中兴通讯股份有限公司 | Really simple syndication subscription method and client thereof |
CN103207859A (en) * | 2012-01-11 | 2013-07-17 | 北京四维图新科技股份有限公司 | Method and device for integrating databases |
CN103207859B (en) * | 2012-01-11 | 2016-07-06 | 北京四维图新科技股份有限公司 | The method and apparatus of integrated database |
CN102779146A (en) * | 2012-04-26 | 2012-11-14 | 新奥特(北京)视频技术有限公司 | Method and system for updating data in local database in real time |
CN102799602A (en) * | 2012-04-26 | 2012-11-28 | 新奥特(北京)视频技术有限公司 | Method and system for acquiring data from Internet |
CN102799602B (en) * | 2012-04-26 | 2018-03-16 | 新奥特(北京)视频技术有限公司 | A kind of method and system that data are obtained from internet |
WO2013185587A1 (en) * | 2012-06-11 | 2013-12-19 | 腾讯科技(深圳)有限公司 | Information syndication file synchronizing method, device and system |
CN108665654A (en) * | 2018-05-18 | 2018-10-16 | 任飞翔 | Cash register information synchronization method and cash register system |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN100353733C (en) | RSS message interactive processing method based on XML file | |
CN101286169B (en) | Client end management for coordinating content downloading order | |
CN101375247B (en) | Service creating method, for realizing computer program and the computer system of described method | |
CN100444174C (en) | Method for picking-up, and aggregating micro content of web page, and automatic updating system | |
CN107357933B (en) | Label description method and device for multi-source heterogeneous scientific and technological information resources | |
CN100430939C (en) | Method and system for client-side manipulation of tables | |
KR102138896B1 (en) | Method for providing online to offline based multiplatform making service combining socialmedia, marketing and e-commerce | |
CN101763419A (en) | Method for synchronously updating remote rss data by local database | |
CN101196899B (en) | Method and system for processing the input in an XML form | |
CN101290624B (en) | News web page metadata automatic extraction method | |
CN107678943B (en) | Page automatic testing method of abstract page object | |
CN101997927A (en) | Method and system for caching data of WEB platform | |
CN109815382B (en) | Method and system for sensing and acquiring large-scale network data | |
CN102279894A (en) | Method for searching, integrating and providing comment information based on semantics and searching system | |
US20060253773A1 (en) | Web-based client/server interaction method and system | |
AU2014400621B2 (en) | System and method for providing contextual analytics data | |
CN110263009A (en) | Generation method, device, equipment and the readable storage medium storing program for executing of log classifying rules | |
CN102880683A (en) | Automatic network generation system for feasibility study report and generation method thereof | |
NL1025547C2 (en) | Content management portal and method for managing digital values. | |
CN109446042A (en) | A kind of blog management method and system for intelligent power equipment | |
US20180315092A1 (en) | Server For Providing Internet Content and Computer-Readable Recording Medium Including Implemented Internet Content Providing Method | |
CN109284469B (en) | Webpage development framework | |
KR20090047756A (en) | System and method for providing internet users with individually customized rss service | |
US9047300B2 (en) | Techniques to manage universal file descriptor models for content files | |
US20080281828A1 (en) | Variable Data Replacement Technique For An Electronic Communication System |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C12 | Rejection of a patent application after its publication | ||
RJ01 | Rejection of invention patent application after publication |
Open date: 20100630 |