CN102799602B - Method and system for obtaining data from the Internet - Google Patents

Method and system for obtaining data from the Internet

Info

Publication number
CN102799602B
Authority
CN
China
Prior art keywords
xml file
rss
xml
target database
stored
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201210126411.6A
Other languages
Chinese (zh)
Other versions
CN102799602A (en)
Inventor
王征
赵海军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Xinaote Intelligent Sports Innovation Development Co., Ltd.
Original Assignee
China Digital Video Beijing Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Digital Video Beijing Ltd
Priority to CN201210126411.6A
Publication of CN102799602A
Application granted
Publication of CN102799602B
Expired - Fee Related (current legal status)
Anticipated expiration

Landscapes

  • Information Transfer Between Computers (AREA)

Abstract

The invention discloses a method and system for obtaining data from the Internet. The method comprises: obtaining an Extensible Markup Language (XML) file from a network data provider; judging whether the obtained XML file is valid; if it is valid, analyzing the XML file, where the XML file is in RSS format if it conforms to the standard format of Really Simple Syndication (RSS) and is otherwise in a non-standard RSS format; if it is not valid, obtaining the XML file from the network data provider again; and adaptively storing the XML file in a target database according to its format. The invention can intelligently recognize XML files of different formats from the Internet, including RSS and non-standard RSS, and store them in the target database, which improves the flexibility of obtaining data from the Internet and provides users with more convenient and more timely Internet resources.

Description

Method and system for obtaining data from the Internet
Technical field
The present invention relates to the technical field of Internet information, and more particularly to a method and system for obtaining data from the Internet.
Background
With the rapid development of information technology, the world has entered the information age. Information is abundant and varied, and because it can be put to use by certain groups of people it is regarded as a resource; information that can be put to use in this way is referred to as information content. A so-called information broadcasting system, also known as an image-and-text information broadcasting system, is defined in contrast to a traditional television broadcast system. A traditional video broadcast system takes the broadcast of moving television pictures and accompanying sound as its main task, whereas an information broadcasting system propagates various kinds of information mainly through text, graphics, and charts, supplemented by moving images. It can independently drive the broadcast of an entire television channel (an information channel or a TV shopping channel), or it can be attached to a traditional broadcast system to increase the amount of information the channel broadcasts. Existing information broadcasting systems have the following characteristics: 1. pictures, video, upward-scrolling text, left-flying text, and animated corner logos are broadcast on the same screen; 2. multiple lines of information can be modified in real time and broadcast in real time; 3. templates for all kinds of TV programs can be customized, and program packaging can be applied directly; 4. layouts are flexible and varied, and multiple advertisement slots can be set arbitrarily; 5. an unlimited number of subtitle layers can be superimposed in real time; 6. a large amount of display advertising information and animation files can be added to an advertisement window, and each advertisement can carry a title and text information; 7. financial information, exchange rate windows, stock market news, weather forecasts, and the like can be broadcast simultaneously. The data broadcast in an information broadcasting system is obtained from network data providers.
Extensible Markup Language (XML) is a markup language used to mark up electronic documents so that they have structure. It can be used to tag data and define data types, and it is a source language that allows users to define their own markup languages. XML is a subset of the Standard Generalized Markup Language (SGML) and is well suited to transmission over the Web. XML provides a uniform way to describe and exchange structured data independently of applications or vendors.
RSS is one of the formats an XML file can take. RSS (Really Simple Syndication, also called aggregated content) is a format for describing and synchronizing website content. The abbreviation RSS has three expansions: Really Simple Syndication; RDF (Resource Description Framework) Site Summary; and Rich Site Summary. All three in fact refer to the same syndication technology. RSS is now widely used for online news channels, blogs, and wikis; the main versions are 0.91, 1.0, and 2.0. Subscribing through RSS lets users obtain information quickly, and a website that provides RSS output makes it easier for users to get the latest updates to its content.
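As an illustration of how the RSS versions mentioned above differ at the XML level, the following minimal sketch reads the root element of a feed. It relies on two facts about the formats themselves: RSS 0.91 and 2.0 use a root element named rss with a version attribute, while RSS 1.0 is RDF-based and uses rdf:RDF as its root. The function name rss_flavor and the sample feed are hypothetical and are not part of the patent.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def rss_flavor(xml_text: str) -> str:
    """Return a short label for the RSS family of a feed, or 'unknown'."""
    root = ET.fromstring(xml_text)
    if root.tag == "rss":
        # RSS 0.91 and 2.0 carry the version in an attribute on the root.
        return "rss " + root.get("version", "?")
    if root.tag == f"{{{RDF_NS}}}RDF":
        # RSS 1.0 is RDF-based, so its root element is rdf:RDF.
        return "rss 1.0 (rdf)"
    return "unknown"


# Hypothetical sample feed, used only for illustration.
SAMPLE = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <link>http://example.com/</link>
    <description>Hypothetical sample channel</description>
    <item><title>First item</title><link>http://example.com/1</link></item>
  </channel>
</rss>"""

print(rss_flavor(SAMPLE))  # prints "rss 2.0"
```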
In the course of implementing the present invention, the inventors found the following defect in the prior art: when XML files are obtained from the Internet, only data in a single format can be subscribed to and obtained, and data in multiple formats cannot be recognized at the same time.
Summary of the invention
To address the defects in the prior art, the present invention intelligently recognizes XML files of different formats from the Internet, including RSS and non-standard RSS, which improves the flexibility of obtaining data from the Internet and provides users with more convenient and more timely Internet resources.
To solve the above technical problem, the present invention provides a method for obtaining data from the Internet, comprising:
obtaining an Extensible Markup Language (XML) file from a network data provider;
judging whether the obtained XML file is valid; if it is valid, analyzing the XML file, where the XML file is in RSS format if it conforms to the standard RSS (aggregated content) format and is otherwise in a non-standard RSS format; if it is not valid, obtaining the XML file from the network data provider again;
adaptively storing the XML file in a target database according to its format, specifically comprising:
when the format of the XML file is RSS, parsing the XML file and then storing it in the target database; or, when the format of the XML file is non-standard RSS, storing it directly in the target database.
Obtaining the XML file from the network data provider specifically comprises:
importing the XML address in the form of a parameter according to the user's request;
analyzing the XML address to obtain the corresponding URL link;
obtaining the XML file by reading the URL link.
Judging whether the obtained XML file is valid specifically comprises:
judging whether the obtained XML file is valid according to the syntactic properties of XML.
Parsing the XML file and then storing it in the target database when its format is RSS specifically comprises:
when the format of the XML file is RSS, storing the parsed XML file row by row in the T_XmlRss table of the target database.
Storing the XML file directly in the target database when its format is non-standard RSS specifically comprises:
when the format of the XML file is non-standard RSS, storing the XML directly in the T_XmlOriginal table of the target database.
The present invention also provides a system for obtaining data from the Internet, comprising:
an acquiring unit, configured to obtain an Extensible Markup Language (XML) file from a network data provider;
a judging unit, configured to judge whether the obtained XML file is valid;
an analysis unit, configured to analyze the XML file, where the XML file is in RSS format if it conforms to the standard format of Really Simple Syndication (RSS) and is otherwise in a non-standard RSS format;
a storage unit, configured to adaptively store XML files of different formats in a target database, the storage unit further comprising a parsing unit configured to parse the XML file and then store it in the target database when its format is RSS; or, when the format of the XML file is non-standard RSS, the XML file is stored directly in the target database.
The acquiring unit specifically comprises an import unit, an analysis unit, and a reading unit, wherein:
the import unit is configured to import the XML address in the form of a parameter according to the user's request;
the analysis unit is configured to analyze the XML address to obtain the corresponding URL link;
the reading unit is configured to obtain the XML file by reading the URL link.
The judging unit is specifically configured to:
judge whether the obtained XML file is valid according to the syntactic properties of XML.
Parsing the XML file and then storing it in the target database when its format is RSS specifically comprises:
when the format of the XML file is RSS, storing the parsed XML file row by row in the T_XmlRss table of the target database.
Storing the XML file directly in the target database when its format is non-standard RSS specifically comprises:
when the format of the XML file is non-standard RSS, storing the XML directly in the T_XmlOriginal table of the target database.
Compared with the prior art, the embodiments of the present invention have the following advantage: XML files of different formats, including RSS and non-standard RSS, are intelligently recognized from the Internet and stored in the target database, which improves the flexibility of obtaining data from the Internet and provides users with more convenient and more timely Internet resources.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flow chart of a method for obtaining data from the Internet in Embodiment 1 of the present invention;
Fig. 2 is a structural diagram of a system for obtaining data from the Internet in Embodiment 2 of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Embodiment 1 of the present invention provides a method for obtaining data from the Internet which, as shown in Fig. 1, comprises the following steps:
Step S101: obtain an Extensible Markup Language (XML) file from a network data provider, specifically comprising:
importing the XML address in the form of a parameter according to the user's request, where multiple XML addresses are separated by spaces and, if a username and password are required, they are appended separated by commas, for example 'xmlReader.exe http:\\rss.sina.com.cn\sports.xml http:Singapore.info.afg.xml,user,pass';
analyzing the XML address to obtain the corresponding URL link;
obtaining the XML file by reading the URL link.
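Step S101 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the names FeedSource, parse_sources, and fetch are hypothetical, addresses are written with forward slashes as ordinary HTTP URLs, an address is assumed to contain no commas of its own, and credentials are assumed to be used for HTTP Basic authentication (the patent does not say how the username and password are applied).

```python
import urllib.request
from typing import NamedTuple, Optional
from urllib.parse import urlsplit


class FeedSource(NamedTuple):
    url: str
    username: Optional[str] = None
    password: Optional[str] = None


def parse_sources(argv: list[str]) -> list[FeedSource]:
    """Turn space-separated arguments of the form URL or URL,user,password
    into FeedSource records, rejecting anything that is not an HTTP(S) URL."""
    sources = []
    for arg in argv:
        parts = arg.split(",")
        if len(parts) == 1:
            source = FeedSource(parts[0])
        elif len(parts) == 3:
            source = FeedSource(*parts)
        else:
            raise ValueError(f"expected URL or URL,user,password: {arg!r}")
        split = urlsplit(source.url)
        if split.scheme not in ("http", "https") or not split.netloc:
            raise ValueError(f"not an HTTP(S) URL: {source.url!r}")
        sources.append(source)
    return sources


def fetch(source: FeedSource, timeout: float = 10.0) -> bytes:
    """Read the XML file behind the URL link (assumes HTTP Basic auth
    when credentials are present)."""
    handlers = []
    if source.username is not None:
        manager = urllib.request.HTTPPasswordMgrWithDefaultRealm()
        manager.add_password(None, source.url, source.username, source.password)
        handlers.append(urllib.request.HTTPBasicAuthHandler(manager))
    opener = urllib.request.build_opener(*handlers)
    with opener.open(source.url, timeout=timeout) as response:
        return response.read()


# Hypothetical arguments; the second URL is a placeholder, not from the patent.
print(parse_sources(["http://rss.sina.com.cn/sports.xml",
                     "http://example.com/feed.xml,user,pass"]))
```

In a command-line tool, parse_sources would typically be called with sys.argv[1:], and fetch would then be called once per returned source.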
Step S102: judge whether the obtained XML file is valid, specifically comprising:
judging whether the obtained XML file is valid according to the syntactic properties of XML; if it is valid, proceed to step S103; if it is not valid, obtain the XML file from the network data provider again.
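One way to read step S102 is as a well-formedness check followed by a re-fetch on failure. The sketch below makes that reading concrete under two assumptions that are not in the patent: "valid according to the syntactic properties of XML" is taken to mean that a standard XML parser accepts the document, and the number of re-fetch attempts is capped (max_attempts is a hypothetical parameter; the patent states no limit). The names is_well_formed and fetch_until_valid are likewise hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Optional


def is_well_formed(xml_bytes: bytes) -> bool:
    """Return True if a standard XML parser accepts the document."""
    try:
        ET.fromstring(xml_bytes)
    except ET.ParseError:
        return False
    return True


def fetch_until_valid(fetch: Callable[[], bytes],
                      max_attempts: int = 3) -> Optional[bytes]:
    """Call fetch() until it returns well-formed XML, up to max_attempts
    times. Returns the document, or None if every attempt failed."""
    for _ in range(max_attempts):
        data = fetch()
        if is_well_formed(data):
            return data
    return None
```

Passing the fetch operation in as a callable keeps the check independent of how the file is downloaded, so the same loop works for any data provider.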
Step S103: analyze the XML file; if it conforms to the standard format of Really Simple Syndication (RSS), the XML file is in RSS format; otherwise it is in a non-standard RSS format.
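The patent does not spell out what conforming to the standard RSS format means, so the following sketch of step S103 adopts one plausible criterion as an assumption: the document has an rss root element with a version attribute and a channel child containing the title, link, and description elements that RSS 2.0 requires. The function name classify and the returned labels are hypothetical.

```python
import xml.etree.ElementTree as ET

# Elements that RSS 2.0 requires inside <channel>.
REQUIRED_CHANNEL_CHILDREN = ("title", "link", "description")


def classify(xml_text: str) -> str:
    """Return 'rss' for a document that meets the assumed standard-RSS
    criterion, otherwise 'nonstandard'. Input must be well-formed XML."""
    root = ET.fromstring(xml_text)
    if root.tag != "rss" or root.get("version") is None:
        return "nonstandard"
    channel = root.find("channel")
    if channel is None:
        return "nonstandard"
    if any(channel.find(name) is None for name in REQUIRED_CHANNEL_CHILDREN):
        return "nonstandard"
    return "rss"
```

A stricter check (for example, validating each item or accepting RDF-based RSS 1.0) would fit the same interface; the criterion here is deliberately small.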
Step S104: adaptively store the XML file in the target database according to its format, specifically comprising:
when the format of the XML file is RSS, parsing the XML file and then storing it in the target database; or, when the format of the XML file is non-standard RSS, storing it directly in the target database.
Parsing the XML file and then storing it in the target database when its format is RSS specifically comprises:
when the format of the XML file is RSS, storing the parsed XML file row by row in the T_XmlRss table of the target database.
Storing the XML file directly in the target database when its format is non-standard RSS specifically comprises:
when the format of the XML file is non-standard RSS, storing the XML directly in the T_XmlOriginal table of the target database.
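The two storage paths of step S104 can be sketched with an embedded database. Only the table names T_XmlRss and T_XmlOriginal come from the patent; the use of SQLite, the column layout, the function names init_db and store, and the reading of "row by row" as one row per RSS item are all assumptions made for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET


def init_db(conn: sqlite3.Connection) -> None:
    """Create the two target tables (hypothetical column layout)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS T_XmlRss ("
        "source_url TEXT, title TEXT, link TEXT, description TEXT, pub_date TEXT)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS T_XmlOriginal (source_url TEXT, raw_xml TEXT)"
    )


def store(conn: sqlite3.Connection, source_url: str, xml_text: str, fmt: str) -> None:
    """Store a standard RSS feed as one row per <item>; store anything
    else as the unparsed XML text."""
    if fmt == "rss":
        root = ET.fromstring(xml_text)
        rows = [
            (
                source_url,
                item.findtext("title"),
                item.findtext("link"),
                item.findtext("description"),
                item.findtext("pubDate"),
            )
            for item in root.iter("item")
        ]
        conn.executemany("INSERT INTO T_XmlRss VALUES (?, ?, ?, ?, ?)", rows)
    else:
        conn.execute(
            "INSERT INTO T_XmlOriginal VALUES (?, ?)", (source_url, xml_text)
        )
    conn.commit()
```

Keeping the raw text for non-standard files means nothing is lost when the structure is unknown; a consumer can apply a format-specific parser to T_XmlOriginal later.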
The technical solution of this embodiment of the present invention has the following beneficial effect: XML files of different formats, including RSS and non-standard RSS, are intelligently recognized from the Internet and stored in the target database, which improves the flexibility of obtaining data from the Internet and provides users with more convenient and more timely Internet resources.
Embodiment 2 of the present invention provides a system for obtaining data from the Internet which, as shown in Fig. 2, comprises:
an acquiring unit 201, configured to obtain an Extensible Markup Language (XML) file from a network data provider;
the acquiring unit specifically comprises an import unit, an analysis unit, and a reading unit, wherein:
the import unit 2011 is configured to import the XML address in the form of a parameter according to the user's request;
the analysis unit 2012 is configured to analyze the XML address to obtain the corresponding URL link;
the reading unit 2013 is configured to obtain the XML file by reading the URL link.
A judging unit 202, configured to judge whether the obtained XML file is valid, specifically by:
judging whether the obtained XML file is valid according to the syntactic properties of XML.
An analysis unit 203, configured to analyze the XML file, where the XML file is in RSS format if it conforms to the standard format of Really Simple Syndication (RSS) and is otherwise in a non-standard RSS format.
A storage unit 204, configured to adaptively store XML files of different formats in the target database, the storage unit further comprising a parsing unit 2031 configured to parse the XML file and then store it in the target database when its format is RSS; or, when the format of the XML file is non-standard RSS, the XML file is stored directly in the target database.
Parsing the XML file and then storing it in the target database when its format is RSS specifically comprises:
when the format of the XML file is RSS, storing the parsed XML file row by row in the T_XmlRss table of the target database.
Storing the XML file directly in the target database when its format is non-standard RSS specifically comprises:
when the format of the XML file is non-standard RSS, storing the XML directly in the T_XmlOriginal table of the target database.
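The division into units 201 to 204 can be pictured as four interchangeable components wired together by a small coordinator. The sketch below is only one hypothetical arrangement: the class name FeedSystem, the callable signatures, and the capped retry count max_attempts are assumptions, and the patent does not prescribe any particular software structure.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class FeedSystem:
    acquire: Callable[[str], str]           # acquiring unit 201: URL -> XML text
    is_valid: Callable[[str], bool]         # judging unit 202
    classify: Callable[[str], str]          # analysis unit 203: 'rss' or 'nonstandard'
    store: Callable[[str, str, str], None]  # storage unit 204: (url, xml, format)
    max_attempts: int = 3                   # assumed cap; the patent gives no limit

    def run(self, url: str) -> bool:
        """Acquire, judge, analyze, and store one address.
        Returns True if a valid file was stored."""
        for _ in range(self.max_attempts):
            xml_text = self.acquire(url)
            if self.is_valid(xml_text):
                self.store(url, xml_text, self.classify(xml_text))
                return True
        return False
```

Because each unit is passed in as a callable, any one of them can be replaced (for example, a different storage backend) without touching the others, which mirrors the statement later in the description that modules may be merged or split.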
The technical solution of this embodiment of the present invention has the following beneficial effect: XML files of different formats, including RSS and non-standard RSS, are intelligently recognized from the Internet and stored in the target database, which improves the flexibility of obtaining data from the Internet and provides users with more convenient and more timely Internet resources.
From the description of the embodiments above, a person skilled in the art can clearly understand that the present invention may be implemented by hardware, or by software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solution of the present invention may be embodied as a software product, which may be stored in a non-volatile storage medium (such as a CD-ROM, USB flash drive, or removable hard disk) and which includes several instructions that cause a computer device (such as a personal computer, server, or network device) to perform the methods described in the embodiments of the present invention.
A person skilled in the art will understand that the accompanying drawings are schematic diagrams of a preferred embodiment, and that the modules or flows in the drawings are not necessarily required for implementing the present invention.
A person skilled in the art will understand that the modules of the device in an embodiment may be distributed in the device as described in the embodiment, or may be changed accordingly and located in one or more devices different from that of the embodiment. The modules of the above embodiments may be combined into one module or further split into multiple sub-modules.
The embodiments of the present invention are numbered for description only, and the numbering does not indicate the relative merits of the embodiments.
What is disclosed above is merely several specific embodiments of the present invention. The present invention is not limited to these, and any variation that a person skilled in the art can conceive of shall fall within the protection scope of the present invention.

Claims (8)

  1. A method for obtaining data from the Internet, characterized by comprising:
    obtaining an Extensible Markup Language (XML) file from a network data provider, specifically comprising: importing the XML address in the form of a parameter according to the user's request; analyzing the XML address to obtain the corresponding URL link; and obtaining the XML file by reading the URL link;
    judging whether the obtained XML file is valid; if it is valid, analyzing the XML file, where the XML file is in RSS format if it conforms to the standard RSS (aggregated content) format and is otherwise in a non-standard RSS format; if it is not valid, obtaining the XML file from the network data provider again;
    adaptively storing the XML file in a target database according to its format, specifically comprising:
    when the format of the XML file is RSS, parsing the XML file and then storing it in the target database; or, when the format of the XML file is non-standard RSS, storing it directly in the target database.
  2. The method of claim 1, characterized in that judging whether the obtained XML file is valid specifically comprises:
    judging whether the obtained XML file is valid according to the syntactic properties of XML.
  3. The method of claim 1, characterized in that parsing the XML file and then storing it in the target database when its format is RSS specifically comprises:
    when the format of the XML file is RSS, storing the parsed XML file row by row in the T_XmlRss table of the target database.
  4. The method of claim 1, characterized in that storing the XML file directly in the target database when its format is non-standard RSS specifically comprises:
    when the format of the XML file is non-standard RSS, storing the XML directly in the T_XmlOriginal table of the target database.
  5. A system for obtaining data from the Internet, characterized by comprising:
    an acquiring unit, configured to obtain an Extensible Markup Language (XML) file from a network data provider;
    a judging unit, configured to judge whether the obtained XML file is valid;
    an analysis unit, configured to analyze the XML file, where the XML file is in RSS format if it conforms to the standard format of Really Simple Syndication (RSS) and is otherwise in a non-standard RSS format;
    a storage unit, configured to adaptively store XML files of different formats in a target database, the storage unit further comprising a parsing unit configured to parse the XML file and then store it in the target database when its format is RSS; or, when the format of the XML file is non-standard RSS, the XML file is stored directly in the target database;
    the acquiring unit specifically comprises an import unit, an analysis unit, and a reading unit, wherein:
    the import unit is configured to import the XML address in the form of a parameter according to the user's request;
    the analysis unit is configured to analyze the XML address to obtain the corresponding URL link;
    the reading unit is configured to obtain the XML file by reading the URL link.
  6. The system of claim 5, characterized in that the judging unit is specifically configured to:
    judge whether the obtained XML file is valid according to the syntactic properties of XML.
  7. The system of claim 5, characterized in that parsing the XML file and then storing it in the target database when its format is RSS specifically comprises:
    when the format of the XML file is RSS, storing the parsed XML file row by row in the T_XmlRss table of the target database.
  8. The system of claim 5, characterized in that storing the XML file directly in the target database when its format is non-standard RSS specifically comprises: when the format of the XML file is non-standard RSS, storing the XML directly in the T_XmlOriginal table of the target database.
CN201210126411.6A 2012-04-26 2012-04-26 Method and system for obtaining data from the Internet Expired - Fee Related CN102799602B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210126411.6A CN102799602B (en) 2012-04-26 2012-04-26 Method and system for obtaining data from the Internet

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210126411.6A CN102799602B (en) 2012-04-26 2012-04-26 Method and system for obtaining data from the Internet

Publications (2)

Publication Number Publication Date
CN102799602A CN102799602A (en) 2012-11-28
CN102799602B true CN102799602B (en) 2018-03-16

Family

ID=47198714

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210126411.6A Expired - Fee Related CN102799602B (en) 2012-04-26 2012-04-26 Method and system for obtaining data from the Internet

Country Status (1)

Country Link
CN (1) CN102799602B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1672523A2 (en) * 2004-12-20 2006-06-21 Microsoft Corporation Method and system for linking data ranges of a computer-generated document with associated extensible markup language elements
CN2852542Y (en) * 2005-11-07 2006-12-27 国网北京电力建设研究院 Wireless communication base station for transmission line monitoring
CN101739421A (en) * 2008-11-21 2010-06-16 上海电机学院 XML-based data integration information exchange platform
CN101763419A (en) * 2009-12-28 2010-06-30 山东大学 Method for synchronously updating remote rss data by local database
US7752224B2 (en) * 2005-02-25 2010-07-06 Microsoft Corporation Programmability for XML data store for documents

Also Published As

Publication number Publication date
CN102799602A (en) 2012-11-28

Similar Documents

Publication Publication Date Title
Bruns Faster than the speed of print: Reconciling'big data'social media analysis and academic scholarship
CN101894168B (en) Method and system for layout display of web page of mobile terminal
US8869025B2 (en) Method and system for identifying advertisement in web page
US20100118035A1 (en) Moving image generation method, moving image generation program, and moving image generation device
US20120128334A1 (en) Apparatus and method for mashup of multimedia content
CN101753559B (en) Network resource obtaining system and network resource list obtaining method
CN105868276A (en) Webpage displaying method and device thereof
CN104090757B (en) For the rich media information methods of exhibiting of browser
CN102724586B (en) Page caching method and page caching system based on internet protocol television (IPTV)
CN104090923B (en) The methods of exhibiting and device of a kind of rich media information in browser
CN109683978A (en) A kind of method, apparatus and electronic equipment of the rendering of streaming layout interface
CN105931161A (en) Teaching material resource management system and teaching material resource management method
Andersson et al. Mobile e-services using HTML5
Gil-Jaurena Openness in higher education
CN102799602B (en) A kind of method and system that data are obtained from internet
CN102523512A (en) Video output method with operable implicit content
Zervas et al. Enhancing educational metadata with mobile assisted language learning information
CN102289494A (en) System and method for generating video on demand one-way or two-way WEB navigation page
CN103218358A (en) Diff scoring method and system
Fauzi et al. Transformation and Challenges of Digital Journalism in Aceh
Bangani et al. Media as a scholarly source of information: citations for legal theses and dissertations
Atnafu Local internet content: the case of Ethiopia
CN102779146A (en) Method and system for updating data in local database in real time
CN102799604B (en) A kind of method and system to save historical data in information broadcasting system database
KR101331533B1 (en) Mobile device capable of providing optional information considering screen size

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20190320

Address after: 100195 No. 621, 6th floor, No. 1 Building, 131 North West Fourth Ring Road, Haidian District, Beijing

Patentee after: Beijing Xinaote Intelligent Sports Innovation Development Co., Ltd.

Address before: 100195 new technology building, 49 Wukesong Road, Haidian District, Beijing

Patentee before: China Digital Video (Beijing) Limited

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20180316

Termination date: 20200426