CN102799602B - Method and system for obtaining data from the Internet - Google Patents
- Publication number
- CN102799602B (application CN201210126411.6A)
- Authority
- CN
- China
- Prior art keywords
- xml file
- rss
- xml
- target database
- stored
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Landscapes
- Information Transfer Between Computers (AREA)
Abstract
The invention discloses a method and system for obtaining data from the Internet. The method specifically includes: obtaining an Extensible Markup Language (XML) file from a network data provider; judging whether the obtained XML file is valid; if it is valid, analyzing the XML file, where the XML file is in RSS format if it conforms to the standard format of Really Simple Syndication (RSS) and is otherwise in non-standard RSS format; if it is not valid, obtaining the XML file from the network data provider again; and storing the XML file in a target database in a manner adapted to its format. The invention can intelligently recognize XML files of different formats from the Internet, including RSS and non-standard RSS, and store them in the target database, which improves the flexibility of obtaining data from the Internet and provides users with more convenient and real-time network resources.
Description
Technical field
The present invention relates to the technical field of Internet information, and more particularly to a method and system for obtaining data from the Internet.
Background technology
With the rapid development of information technology, the world has entered the information age, in which information is vast and varied. Because information can be put to use by certain groups of people, it is regarded as a resource, and such usable information is what the term "information" refers to here. The so-called information broadcasting system, also known as an image-text information broadcasting system, is defined relative to the traditional television broadcast system. A traditional video broadcast system takes the broadcasting of moving television pictures and accompanying sound as its main task, whereas an information broadcasting system propagates various kinds of information mainly through text, graphics, and charts, supplemented by moving images. It can independently handle the broadcast of a television channel (an information channel or a TV shopping channel), or it can be attached to a traditional broadcast system to increase the amount of information the channel broadcasts. Existing information broadcasting systems have the following characteristics:
1. pictures, video, upward scrolls, leftward crawls, and animated corner marks are broadcast on the same screen;
2. multiple lines of information can be modified in real time and broadcast in real time;
3. various TV program templates can be customized, and program packaging can apply them directly;
4. layouts are flexible, and multiple advertisement slots can be set arbitrarily;
5. unlimited layers of captions can be superimposed in real time;
6. a large amount of display advertising information and animation files can be added to the advertisement window, and each advertisement can have a title and text;
7. financial information, exchange-rate windows, stock market updates, weather forecasts, and the like can be broadcast simultaneously.
The data broadcast by an information broadcasting system is obtained from network data providers.
Extensible Markup Language (XML) is a markup language used to mark up electronic documents so that they are structured. It can be used to tag data and define data types, and it is a source language that allows users to define their own markup languages. XML is a subset of the Standard Generalized Markup Language (SGML) and is well suited to transmission over the Web. XML provides a uniform way to describe and exchange structured data independently of applications or vendors.
RSS is one XML file format. RSS (Really Simple Syndication, also called aggregated content) is a format for describing and synchronizing website content. RSS can stand for any of the following three expansions: Really Simple Syndication; RDF (Resource Description Framework) Site Summary; Rich Site Summary. All three in fact refer to the same syndication technology. RSS is now widely used by online news channels, blogs, and wikis; the main versions are 0.91, 1.0, and 2.0. Subscribing via RSS makes it possible to obtain information quickly, and a website that provides RSS output makes it easier for users to obtain the latest updates to its content.
In the course of realizing the present invention, the inventors found the following defect in the prior art: when XML files are obtained from the Internet, data can only be subscribed to and obtained in a single format, and data in multiple formats cannot be recognized at the same time.
Summary of the invention
In view of the defects in the prior art, the present invention can intelligently recognize XML files of different formats from the Internet, including RSS and non-standard RSS, which improves the flexibility of obtaining data from the Internet and provides users with more convenient and real-time network resources.
To solve the above technical problem, the present invention provides a method for obtaining data from the Internet, which specifically includes:
Obtaining an Extensible Markup Language (XML) file from a network data provider;
Judging whether the obtained XML file is valid; if it is valid, analyzing the XML file, where the XML file is in RSS format if it conforms to the standard format of aggregated content (RSS) and is otherwise in non-standard RSS format; if it is not valid, obtaining the XML file from the network data provider again;
Storing the XML file in a target database in a manner adapted to its format, which specifically includes:
When the format of the XML file is RSS, parsing the file and then storing it in the target database; or, when the format of the XML file is non-standard RSS, storing it directly in the target database.
Obtaining the XML file from the network data provider specifically includes:
Importing the XML address as a parameter according to the user's request;
Analyzing the XML address to obtain the corresponding URL link;
Obtaining the XML file by reading the URL link.
Judging whether the obtained XML file is valid specifically includes:
Judging, according to XML syntax rules, whether the obtained XML file is valid.
Parsing the file and then storing it in the target database when the format of the XML file is RSS specifically includes:
When the format of the XML file is RSS, parsing the file and then storing it row by row in the T_XmlRss table of the target database.
Storing the file directly in the target database when the format of the XML file is non-standard RSS specifically includes:
When the format of the XML file is non-standard RSS, storing the XML directly in the T_XmlOriginal table of the target database.
The present invention also provides a system for obtaining data from the Internet, which specifically includes:
An acquiring unit, for obtaining an Extensible Markup Language (XML) file from a network data provider;
A judging unit, for judging whether the obtained XML file is valid;
An analysis unit, for analyzing the XML file, where the XML file is in RSS format if it conforms to the standard format of Really Simple Syndication (RSS) and is otherwise in non-standard RSS format;
A storage unit, for storing XML files of different formats in a target database in a manner adapted to their format, which further includes: a parsing unit, for parsing the XML file and then storing it in the target database when the format of the XML file is RSS; or, when the format of the XML file is non-standard RSS, storing it directly in the target database.
The acquiring unit specifically includes an import unit, an analysis unit, and a reading unit, where:
The import unit is for importing the XML address as a parameter according to the user's request;
The analysis unit is for analyzing the XML address to obtain the corresponding URL link;
The reading unit is for obtaining the XML file by reading the URL link.
The judging unit is specifically for:
Judging, according to XML syntax rules, whether the obtained XML file is valid.
Parsing the file and then storing it in the target database when the format of the XML file is RSS specifically includes:
When the format of the XML file is RSS, parsing the file and then storing it row by row in the T_XmlRss table of the target database.
Storing the file directly in the target database when the format of the XML file is non-standard RSS specifically includes:
When the format of the XML file is non-standard RSS, storing the XML directly in the T_XmlOriginal table of the target database.
Compared with the prior art, the embodiments of the present invention have the following advantage: XML files of different formats, including RSS and non-standard RSS, are intelligently recognized from the Internet and stored in the target database, which improves the flexibility of obtaining data from the Internet and provides users with more convenient and real-time network resources.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow chart of a method for obtaining data from the Internet in Embodiment 1 of the present invention;
Fig. 2 is a structural diagram of a system for obtaining data from the Internet in Embodiment 2 of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Embodiment 1 of the present invention provides a method for obtaining data from the Internet which, as shown in Fig. 1, includes the following steps:
Step S101: obtain an Extensible Markup Language (XML) file from a network data provider, which specifically includes:
Importing the XML address as a parameter according to the user's request. Multiple XML addresses are separated by spaces; if a username and password are required, they are separated by commas, for example 'xmlReader.exe http:\\rss.sina.com.cn\sports.xml http:Singapore.info.afg.xml,user,pass';
Analyzing the XML address to obtain the corresponding URL link;
Obtaining the XML file by reading the URL link.
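Step S101 can be sketched in Python as follows. This is only an illustration under stated assumptions, not the patented implementation: the names `FeedSource`, `parse_sources`, and `fetch` are hypothetical, the username and password are assumed to be used for HTTP Basic authentication (the text does not say which scheme is used), and the addresses themselves are assumed to contain no commas.

```python
from dataclasses import dataclass
from typing import List, Optional
import urllib.request


@dataclass
class FeedSource:
    """One XML address taken from the command line (hypothetical type)."""
    url: str
    username: Optional[str] = None
    password: Optional[str] = None


def parse_sources(argv: List[str]) -> List[FeedSource]:
    """Turn space-separated arguments into sources.

    Each argument is either 'url' or 'url,user,pass', as in the example
    command in the text. Whitespace around each part is ignored.
    """
    sources = []
    for arg in argv:
        parts = [p.strip() for p in arg.split(",")]
        if len(parts) == 1:
            sources.append(FeedSource(parts[0]))
        elif len(parts) == 3:
            sources.append(FeedSource(parts[0], parts[1], parts[2]))
        else:
            raise ValueError(f"expected 'url' or 'url,user,pass', got {arg!r}")
    return sources


def fetch(source: FeedSource, timeout: float = 10.0) -> bytes:
    """Read the URL and return the raw XML bytes.

    Assumption: credentials, when present, are sent via HTTP Basic auth.
    """
    handlers = []
    if source.username is not None:
        mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
        mgr.add_password(None, source.url, source.username, source.password)
        handlers.append(urllib.request.HTTPBasicAuthHandler(mgr))
    opener = urllib.request.build_opener(*handlers)
    with opener.open(source.url, timeout=timeout) as resp:
        return resp.read()
```

A caller would pass `sys.argv[1:]` to `parse_sources` and then call `fetch` on each returned source.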
Step S102: judge whether the obtained XML file is valid, which specifically includes:
Judging, according to XML syntax rules, whether the obtained XML file is valid; if it is valid, proceed to step S103; if it is not valid, obtain the XML file from the network data provider again.
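The validity check and re-fetch of step S102 might look like the sketch below. It assumes that "valid according to XML syntax rules" means well-formed XML, and it caps the number of re-fetches with a hypothetical `max_attempts` parameter; the text itself does not state a retry limit. The function names are illustrative only.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Optional


def is_well_formed(data: bytes) -> bool:
    """Return True if the bytes parse as well-formed XML."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


def fetch_until_valid(fetch: Callable[[], bytes],
                      max_attempts: int = 3) -> Optional[bytes]:
    """Call fetch() again until the result is well-formed XML.

    Returns the XML bytes, or None if every attempt failed.
    The retry cap is an assumption added for this sketch.
    """
    for _ in range(max_attempts):
        data = fetch()
        if is_well_formed(data):
            return data
    return None
```

Note that `xml.etree.ElementTree` is not hardened against maliciously constructed XML, so a production reader of untrusted feeds would likely need additional safeguards.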
Step S103: analyze the XML file. If it conforms to the standard format of Really Simple Syndication (RSS), the XML file is in RSS format; otherwise it is in non-standard RSS format.
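The text does not spell out what "conforms to the standard format" means in detail. As a rough sketch, the check below assumes the RSS 0.91/2.0 layout: a root `rss` element with a `channel` child that carries `title`, `link`, and `description`. RSS 1.0, which uses an RDF root element, is deliberately not handled here and would be classified as non-standard by this sketch. The function name `classify` and its return values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Channel elements that RSS 2.0 requires; used here as the assumed test
# for "standard" RSS.
REQUIRED_CHANNEL_FIELDS = ("title", "link", "description")


def classify(data: bytes) -> str:
    """Return 'rss' for standard RSS, 'non-standard' for anything else.

    The input is assumed to be well-formed XML already (step S102).
    """
    root = ET.fromstring(data)
    if root.tag != "rss":
        return "non-standard"
    channel = root.find("channel")
    if channel is None:
        return "non-standard"
    if any(channel.find(name) is None for name in REQUIRED_CHANNEL_FIELDS):
        return "non-standard"
    return "rss"
```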
Step S104: store the XML file in the target database in a manner adapted to its format, which specifically includes:
When the format of the XML file is RSS, parsing the file and then storing it in the target database; or, when the format of the XML file is non-standard RSS, storing it directly in the target database.
Parsing the file and then storing it in the target database when the format of the XML file is RSS specifically includes:
When the format of the XML file is RSS, parsing the file and then storing it row by row in the T_XmlRss table of the target database;
Storing the file directly in the target database when the format of the XML file is non-standard RSS specifically includes:
When the format of the XML file is non-standard RSS, storing the XML directly in the T_XmlOriginal table of the target database.
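The two storage paths of step S104 can be illustrated with SQLite. Only the table names T_XmlRss and T_XmlOriginal come from the text; the choice of SQLite, the column layout, and the function names `init_db` and `store` are assumptions made for this sketch. Standard RSS is parsed and written one row per `item`, while non-standard XML is written unparsed as a single row.

```python
import sqlite3
import xml.etree.ElementTree as ET


def init_db(conn: sqlite3.Connection) -> None:
    """Create the two tables; the columns are hypothetical."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS T_XmlRss "
        "(source_url TEXT, title TEXT, link TEXT, description TEXT)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS T_XmlOriginal (source_url TEXT, xml TEXT)"
    )


def store(conn: sqlite3.Connection, source_url: str,
          data: bytes, is_rss: bool) -> None:
    """Store one fetched XML file according to its format."""
    if is_rss:
        # Standard RSS: parse, then insert one row per <item>.
        root = ET.fromstring(data)
        rows = [
            (source_url, item.findtext("title"),
             item.findtext("link"), item.findtext("description"))
            for item in root.iter("item")
        ]
        conn.executemany("INSERT INTO T_XmlRss VALUES (?, ?, ?, ?)", rows)
    else:
        # Non-standard RSS: keep the original XML text as-is.
        text = data.decode("utf-8", errors="replace")
        conn.execute("INSERT INTO T_XmlOriginal VALUES (?, ?)",
                     (source_url, text))
    conn.commit()
```

Keeping the raw XML for non-standard feeds means nothing is lost when the structure is unknown; a later process can interpret it once the format is understood.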
The technical solution of this embodiment of the present invention has the following beneficial effect: XML files of different formats, including RSS and non-standard RSS, are intelligently recognized from the Internet and stored in the target database, which improves the flexibility of obtaining data from the Internet and provides users with more convenient and real-time network resources.
Embodiment 2 of the present invention provides a system for obtaining data from the Internet which, as shown in Fig. 2, includes:
An acquiring unit 201, for obtaining an Extensible Markup Language (XML) file from a network data provider;
The acquiring unit specifically includes an import unit, an analysis unit, and a reading unit, where:
The import unit 2011 is for importing the XML address as a parameter according to the user's request;
The analysis unit 2012 is for analyzing the XML address to obtain the corresponding URL link;
The reading unit 2013 is for obtaining the XML file by reading the URL link.
A judging unit 202, for judging whether the obtained XML file is valid, specifically:
Judging, according to XML syntax rules, whether the obtained XML file is valid.
An analysis unit 203, for analyzing the XML file, where the XML file is in RSS format if it conforms to the standard format of Really Simple Syndication (RSS) and is otherwise in non-standard RSS format.
A storage unit 204, for storing XML files of different formats in the target database in a manner adapted to their format, which further includes: a parsing unit 2031, for parsing the XML file and then storing it in the target database when the format of the XML file is RSS; or, when the format of the XML file is non-standard RSS, storing it directly in the target database.
Parsing the file and then storing it in the target database when the format of the XML file is RSS specifically includes:
When the format of the XML file is RSS, parsing the file and then storing it row by row in the T_XmlRss table of the target database.
Storing the file directly in the target database when the format of the XML file is non-standard RSS specifically includes:
When the format of the XML file is non-standard RSS, storing the XML directly in the T_XmlOriginal table of the target database.
The technical solution of this embodiment of the present invention has the following beneficial effect: XML files of different formats, including RSS and non-standard RSS, are intelligently recognized from the Internet and stored in the target database, which improves the flexibility of obtaining data from the Internet and provides users with more convenient and real-time network resources.
From the above description of the embodiments, those skilled in the art can clearly understand that the present invention may be implemented in hardware, or in software together with a necessary general-purpose hardware platform. Based on this understanding, the technical solution of the present invention may be embodied as a software product, which may be stored on a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a removable hard disk) and includes instructions that cause a computer device (such as a personal computer, a server, or a network device) to perform the methods described in the embodiments of the present invention.
Those skilled in the art will appreciate that the drawings are schematic diagrams of a preferred embodiment, and that the modules or flows in the drawings are not necessarily required to implement the present invention.
Those skilled in the art will appreciate that the modules of the device in an embodiment may be distributed in the device as described in the embodiment, or may be changed accordingly and located in one or more devices different from that of the embodiment. The modules of the above embodiments may be merged into one module or further split into multiple sub-modules.
The embodiments of the present invention are described for illustration only and do not indicate the relative merits of the embodiments.
The above discloses only several specific embodiments of the present invention; however, the present invention is not limited to them, and any change that a person skilled in the art can think of shall fall within the protection scope of the present invention.
Claims (8)
- 1. A method for obtaining data from the Internet, characterized by comprising: obtaining an Extensible Markup Language (XML) file from a network data provider, which specifically comprises: importing the XML address as a parameter according to the user's request; analyzing the XML address to obtain the corresponding URL link; and obtaining the XML file by reading the URL link; judging whether the obtained XML file is valid; if it is valid, analyzing the XML file, where the XML file is in RSS format if it conforms to the standard format of aggregated content (RSS) and is otherwise in non-standard RSS format; if it is not valid, obtaining the XML file from the network data provider again; and storing the XML file in a target database in a manner adapted to its format, which specifically comprises: when the format of the XML file is RSS, parsing the file and then storing it in the target database; or, when the format of the XML file is non-standard RSS, storing it directly in the target database.
- 2. The method of claim 1, characterized in that judging whether the obtained XML file is valid specifically comprises: judging, according to XML syntax rules, whether the obtained XML file is valid.
- 3. The method of claim 1, characterized in that parsing the file and then storing it in the target database when the format of the XML file is RSS specifically comprises: when the format of the XML file is RSS, parsing the file and then storing it row by row in the T_XmlRss table of the target database.
- 4. The method of claim 1, characterized in that storing the file directly in the target database when the format of the XML file is non-standard RSS specifically comprises: when the format of the XML file is non-standard RSS, storing the XML directly in the T_XmlOriginal table of the target database.
- 5. A system for obtaining data from the Internet, characterized by comprising: an acquiring unit, for obtaining an Extensible Markup Language (XML) file from a network data provider; a judging unit, for judging whether the obtained XML file is valid; an analysis unit, for analyzing the XML file, where the XML file is in RSS format if it conforms to the standard format of Really Simple Syndication (RSS) and is otherwise in non-standard RSS format; and a storage unit, for storing XML files of different formats in a target database in a manner adapted to their format, which further comprises: a parsing unit, for parsing the XML file and then storing it in the target database when the format of the XML file is RSS; or, when the format of the XML file is non-standard RSS, storing it directly in the target database; wherein the acquiring unit specifically comprises an import unit, an analysis unit, and a reading unit, where the import unit is for importing the XML address as a parameter according to the user's request; the analysis unit is for analyzing the XML address to obtain the corresponding URL link; and the reading unit is for obtaining the XML file by reading the URL link.
- 6. The system of claim 5, characterized in that the judging unit is specifically for: judging, according to XML syntax rules, whether the obtained XML file is valid.
- 7. The system of claim 5, characterized in that parsing the file and then storing it in the target database when the format of the XML file is RSS specifically comprises: when the format of the XML file is RSS, parsing the file and then storing it row by row in the T_XmlRss table of the target database.
- 8. The system of claim 5, characterized in that storing the file directly in the target database when the format of the XML file is non-standard RSS specifically comprises: when the format of the XML file is non-standard RSS, storing the XML directly in the T_XmlOriginal table of the target database.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210126411.6A CN102799602B (en) | 2012-04-26 | 2012-04-26 | A kind of method and system that data are obtained from internet |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210126411.6A CN102799602B (en) | 2012-04-26 | 2012-04-26 | A kind of method and system that data are obtained from internet |
Publications (2)
Publication Number | Publication Date |
---|---|
CN102799602A CN102799602A (en) | 2012-11-28 |
CN102799602B true CN102799602B (en) | 2018-03-16 |
Family
ID=47198714
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201210126411.6A Expired - Fee Related CN102799602B (en) | 2012-04-26 | 2012-04-26 | A kind of method and system that data are obtained from internet |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102799602B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1672523A2 (en) * | 2004-12-20 | 2006-06-21 | Microsoft Corporation | Method and system for linking data ranges of a computer-generated document with associated extensible markup language elements |
CN2852542Y (en) * | 2005-11-07 | 2006-12-27 | 国网北京电力建设研究院 | Wireless communication base station for transmission line monitoring |
CN101739421A (en) * | 2008-11-21 | 2010-06-16 | 上海电机学院 | XML-based data integration information exchange platform |
CN101763419A (en) * | 2009-12-28 | 2010-06-30 | 山东大学 | Method for synchronously updating remote rss data by local database |
US7752224B2 (en) * | 2005-02-25 | 2010-07-06 | Microsoft Corporation | Programmability for XML data store for documents |
-
2012
- 2012-04-26 CN CN201210126411.6A patent/CN102799602B/en not_active Expired - Fee Related
Also Published As
Publication number | Publication date |
---|---|
CN102799602A (en) | 2012-11-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Bruns | Faster than the speed of print: Reconciling'big data'social media analysis and academic scholarship | |
CN101894168B (en) | Method and system for layout display of web page of mobile terminal | |
US8869025B2 (en) | Method and system for identifying advertisement in web page | |
US20100118035A1 (en) | Moving image generation method, moving image generation program, and moving image generation device | |
US20120128334A1 (en) | Apparatus and method for mashup of multimedia content | |
CN101753559B (en) | Network resource obtaining system and network resource list obtaining method | |
CN105868276A (en) | Webpage displaying method and device thereof | |
CN104090757B (en) | For the rich media information methods of exhibiting of browser | |
CN102724586B (en) | Page caching method and page caching system based on internet protocol television (IPTV) | |
CN104090923B (en) | The methods of exhibiting and device of a kind of rich media information in browser | |
CN109683978A (en) | A kind of method, apparatus and electronic equipment of the rendering of streaming layout interface | |
CN105931161A (en) | Teaching material resource management system and teaching material resource management method | |
Andersson et al. | Mobile e-services using HTML5 | |
Gil-Jaurena | Openness in higher education | |
CN102799602B (en) | A kind of method and system that data are obtained from internet | |
CN102523512A (en) | Video output method with operable implicit content | |
Zervas et al. | Enhancing educational metadata with mobile assisted language learning information | |
CN102289494A (en) | System and method for generating video on demand one-way or two-way WEB navigation page | |
CN103218358A (en) | Diff scoring method and system | |
Fauzi et al. | Transformation and Challenges of Digital Journalism in Aceh | |
Bangani et al. | Media as a scholarly source of information: citations for legal theses and dissertations | |
Atnafu | Local internet content: the case of Ethiopia | |
CN102779146A (en) | Method and system for updating data in local database in real time | |
CN102799604B (en) | A kind of method and system to save historical data in information broadcasting system database | |
KR101331533B1 (en) | Mobile device capable of providing optional information considering screen size |
Legal Events
Code | Title | Description |
---|---|---|
C06 | Publication | |
PB01 | Publication | |
C10 | Entry into substantive examination | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
TR01 | Transfer of patent right | Effective date of registration: 20190320. Address after: 100195 No. 621, 6th floor, No. 1 Building, 131 North West Fourth Ring Road, Haidian District, Beijing. Patentee after: Beijing Xinaote Intelligent Sports Innovation Development Co., Ltd. Address before: 100195 new technology building, 49 Wukesong Road, Haidian District, Beijing. Patentee before: China Digital Video (Beijing) Limited |
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20180316. Termination date: 20200426 |