WO2015188431A1 - Method and apparatus for downloading resources - Google Patents

Method and apparatus for downloading resources

Info

Publication number
WO2015188431A1
WO2015188431A1 PCT/CN2014/083594 CN2014083594W WO2015188431A1 WO 2015188431 A1 WO2015188431 A1 WO 2015188431A1 CN 2014083594 W CN2014083594 W CN 2014083594W WO 2015188431 A1 WO2015188431 A1 WO 2015188431A1
Authority
WO
WIPO (PCT)
Prior art keywords
url
resource
resources
dom tree
remaining
Prior art date
Application number
PCT/CN2014/083594
Other languages
English (en)
French (fr)
Inventor
曹刚
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Priority to US15/317,122 (US10262341B2)
Priority to EP14894424.2A (EP3142020A4)
Publication of WO2015188431A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0277 Online advertisement
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/955 Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
    • G06F16/9566 URL specific, e.g. using aliases, detecting broken or misspelled links
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/02 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/06 Protocols specially adapted for file transfer, e.g. file transfer protocol [FTP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/34 Network arrangements or protocols for supporting network services or applications involving the movement of software or configuration parameters

Definitions

  • The present invention relates to the field of communications, and in particular to a method and apparatus for downloading resources.
  • BACKGROUND OF THE INVENTION With the rapid development of wireless communication and Internet technologies, more and more users access the Internet through browsers on mobile terminals. As the entry point to the mobile Internet, the browser is of self-evident importance. Improving the browser's user experience on mobile terminals, and thereby standing out in fierce market competition and gaining market share, has become a focus of current browser research. Typically, when browsing video or music websites with a mobile terminal browser, a user finds that a favorite video or piece of music on the page can only be played online, and that there is no button for downloading the multimedia file.
  • The present invention provides a method and apparatus for downloading resources, so as to at least solve the problem in the related art of how to use a browser to sniff media files, and so offer them to the user for local download, when a video or music website provides its media files for online playback only and does not support local download. According to one aspect of the present invention, a method for downloading resources is provided.
  • The method for downloading resources includes: dynamically detecting the document object model (DOM) tree of the webpage where the resource to be downloaded is located, and acquiring a plurality of uniform resource locator (URL) resources; filtering out, from the plurality of URL resources, the partial URL resources corresponding to advertisement data; and prompting the user to download the URL resources remaining after the advertisement data is filtered out.
  • Preferably, detecting the DOM tree in real time and acquiring the plurality of URL resources includes: determining whether the DOM tree contains a tag from a preset tag set, where the preset tag set includes at least one of the following: a video tag, an audio tag, and an object tag; and, if such a tag exists, obtaining the hypertext reference (href) attribute corresponding to the tag and extracting the plurality of URL resources from the href attribute.
  • Preferably, filtering out the partial URL resources from the plurality of URL resources includes: receiving the most recently updated advertisement interception data from a server, where the advertisement interception data includes identification information of the partial URL resources and feature information that determines the partial URL resources to be advertisement data to be intercepted; and using the advertisement interception data to filter out the partial URL resources from the plurality of URL resources.
  • Preferably, prompting the user to download the remaining URL resources includes: naming the remaining URL resources; and displaying the file names of the remaining URL resources in a preset display manner.
  • Preferably, naming the remaining URL resources includes: setting the title of the webpage where each of the remaining URL resources is located as a first file name; setting the last N characters of each URL resource as a second file name, where N is a positive integer; and combining the first file name and the second file name to name each URL resource.
  • Preferably, dynamically detecting the DOM tree includes one of the following: detecting the DOM tree at a preset period; detection of the DOM tree triggered automatically by a background script of the webpage; and detection of the DOM tree triggered after a webpage sub-resource loading event, caused by the user clicking a preset button, is captured.
  • Preferably, before prompting the user to download the remaining URL resources, the method further includes: performing URL verification on the remaining URL resources using a preset URL specification.
  • A resource downloading apparatus includes: a detecting module, configured to dynamically detect the DOM tree of the webpage where the resource to be downloaded is located and to acquire a plurality of URL resources; a filtering module, configured to filter out, from the plurality of URL resources, the partial URL resources corresponding to advertisement data; and a processing module, configured to prompt the user to download the URL resources remaining after the advertisement data is filtered out.
  • Preferably, the detecting module includes: a determining unit, configured to determine whether the DOM tree contains a tag from a preset tag set, where the preset tag set includes at least one of the following: a video tag, an audio tag, and an object tag; and an extracting unit, configured to, when the output of the determining unit is yes, obtain the href attribute corresponding to the tag and extract the plurality of URL resources from the href attribute.
  • Preferably, the filtering module includes: a receiving unit, configured to receive the most recently updated advertisement interception data from a server, where the advertisement interception data includes identification information of the partial URL resources and feature information that determines the partial URL resources to be advertisement data to be intercepted; and a filtering unit, configured to use the advertisement interception data to filter out the partial URL resources from the plurality of URL resources.
  • Preferably, the processing module includes: a naming unit, configured to name the remaining URL resources; and a display unit, configured to display the file names of the remaining URL resources in a preset display manner.
  • Preferably, the naming unit includes: a first setting subunit, configured to set the title of the webpage where each of the remaining URL resources is located as a first file name; a second setting subunit, configured to set the last N characters of each URL resource as a second file name, where N is a positive integer; and a combining subunit, configured to combine the first file name and the second file name to name each URL resource.
  • Preferably, the detecting module is configured to dynamically detect the DOM tree in one of the following ways: detecting the DOM tree at a preset period; detection of the DOM tree triggered automatically by a background script of the webpage; and detection of the DOM tree triggered after a webpage sub-resource loading event, caused by the user clicking a preset button, is captured.
  • Preferably, the apparatus further includes: a verification module, configured to perform URL verification on the remaining URL resources using a preset URL specification.
  • According to the embodiments of the present invention, the DOM tree of the webpage where the resource to be downloaded is located is dynamically detected to acquire a plurality of URL resources; the partial URL resources corresponding to advertisement data are filtered out from the plurality of URL resources; and the user is prompted to download the URL resources remaining after the advertisement data is filtered out. This solves the problem in the related art of how to use a browser to sniff media files, and so offer them to the user for local download, when a video or music website provides its media files for online playback only and does not support local download.
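  • FIG. 1 is a flowchart of a method for downloading resources according to an embodiment of the present invention
  • FIG. 2 is a flowchart of a method for dynamically detecting the media resources of a webpage according to a preferred embodiment of the present invention
  • FIG. 3 is a flowchart of a method for advertisement filtering and verification of resources sniffed from a webpage according to a preferred embodiment of the present invention
  • FIG. 4 is a flowchart of a method for naming and prompting download files sniffed from a webpage according to a preferred embodiment of the present invention
  • FIG. 5 is a structural block diagram of a resource downloading apparatus according to an embodiment of the present invention
  • FIG. 6 is a structural block diagram of a resource downloading apparatus according to a preferred embodiment of the present invention
  • FIG. 7 is a schematic diagram of the overall architecture of browser dynamic sniffing with advertisement filtering according to a preferred embodiment of the present invention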
  • FIG. 1 is a flowchart of a method of downloading a resource according to an embodiment of the present invention.
  • As shown in FIG. 1, the method may include the following steps: Step S102: Dynamically detect the DOM tree of the webpage where the resource to be downloaded is located, and acquire a plurality of URL resources; Step S104: Filter out, from the plurality of URL resources, the partial URL resources corresponding to advertisement data; Step S106: Prompt the user to download the URL resources remaining after the advertisement data is filtered out.
  • In the related art, when a video or music website provides its media files for online playback only and does not support local download, sniffing media files with an existing browser in order to offer local download lacks accuracy and flexibility.
  • With the method shown in FIG. 1, dynamically detecting the DOM tree of the webpage where the resource to be downloaded is located and acquiring a plurality of URL resources avoids the missed or false detections of existing browser sniffing techniques, and filtering out the partial URL resources corresponding to advertisement data avoids serious interference from advertisement media files. The user is then prompted to download the URL resources remaining after the advertisement data is filtered out, which solves the problem in the related art of how to use a browser to sniff media files for local download when a video or music website provides them for online playback only. Users can thus freely obtain the media files that interest them from webpages that offer only online playback, which greatly improves the user experience.
  • Preferably, in step S102, detecting the DOM tree in real time and acquiring the plurality of URL resources may include the following operations: Step S1: Determine whether the DOM tree contains a tag from a preset tag set, where the preset tag set includes at least one of the following: a video tag, an audio tag, and an object tag; Step S2: If such a tag exists, obtain the hypertext reference (href) attribute corresponding to the tag, and extract the plurality of URL resources from the href attribute.
  • In a preferred implementation, dynamically detecting the DOM tree may use one of the following methods: Method 1: detecting the DOM tree at a preset period; Method 2: detection of the DOM tree triggered automatically by a background script of the webpage; Method 3: detection of the DOM tree triggered after a webpage sub-resource loading event, caused by the user clicking a preset button, is captured.
  • FIG. 2 is a flowchart of a method for dynamically detecting the media resources of a webpage according to a preferred embodiment of the present invention. As shown in FIG. 2, the flow may include the following steps:
  • Step S202: After the browser kernel's page-load-finished signal is received, start the detection of the webpage's DOM tree;
  • Step S204: If the DOM tree of the current webpage is found to contain <video> and/or <audio> and/or <object> tags, obtain the hypertext reference (href) attribute of the node corresponding to the tag, that is, the download URL resource of the audio or video file, and continue to step S206; otherwise, go to step S208;
  • Step S206: Determine whether the download resource URL obtained from the current webpage is a duplicate. If it is not, advertisement filtering can proceed and this stage ends; if it is, go to step S208;
  • Step S208: Start the webpage sub-resource loading listener. When a sub-resource loading event, initiated automatically by the page's background JavaScript or caused by the user clicking the play button, is captured, step S202 is notified to run the detection again;
  • Through this dynamic monitoring, changes to the internal nodes of the DOM tree can be tracked in real time, which avoids missed detections during sniffing.
  • Preferably, in step S104, filtering out the partial URL resources from the plurality of URL resources may include the following steps: Step S3: Receive the most recently updated advertisement interception data from a server, where the advertisement interception data may include, but is not limited to: identification information of the partial URL resources (for example, website host name (HostName) information), and feature information that determines the partial URL resources to be advertisement data to be intercepted (for example, keyword information of the advertisement data corresponding to the HostName information); Step S4: Use the advertisement interception data to filter out the partial URL resources from the plurality of URL resources.
  • FIG. 3 is a flowchart of a method for advertisement filtering and verification of resources sniffed from a webpage according to a preferred embodiment of the present invention. As shown in FIG. 3, the flow may include the following steps:
  • Step S302: Obtain the latest advertisement interception data table from the server by sending it an update request message. The table may include two fields: the website host name (HostName) of currently popular video and/or audio websites, and the keyword (key) contained in the URLs of the advertisements those websites play. A HostName is unique within the table, and one HostName may correspond to multiple keys;
  • Step S304: Match the candidate download resource URLs obtained against the advertisement interception data. In this preferred embodiment, the HostName is first obtained from the URL of the current webpage itself; the advertisement interception data table is then searched for keys corresponding to that HostName; finally, each candidate download resource URL is checked for those keys. If a URL contains one, it is an advertisement resource and is filtered out directly; otherwise, continue to step S306;
  • Step S306: Determine whether the download resource URL that passed advertisement filtering conforms to the URL specification. If it does not, filter it out directly; if it does, continue with the subsequent flow of naming the download file.
  • Preferably, in step S106, prompting the user to download the remaining URL resources may include the following operations: Step S5: Name the remaining URL resources; Step S6: Display the file names of the remaining URL resources in a preset display manner.
  • Preferably, in step S5, naming the remaining URL resources may include the following steps: Step S51: Set the title of the webpage where each of the remaining URL resources is located as a first file name; Step S52: Set the last N characters of each URL resource as a second file name, where N is a positive integer; Step S53: Combine the first file name and the second file name to name each URL resource.
  • FIG. 4 is a flowchart of a method for naming and prompting download files sniffed from a webpage according to a preferred embodiment of the present invention. As shown in FIG. 4, the flow may include the following steps:
  • Step S402: Use the webpage title as the main file name of the download file name; in general, the webpage title carries fairly accurate information about the download file;
  • Step S404: Extract the last N characters (N is a positive integer) of the download resource's URL address as the secondary file name of the download file name; for example, the last 8 characters are typically extracted, mainly so that the file type suffix is retained as far as possible;
  • Step S406: Combine the main file name and the secondary file name as the file name of the download resource;
  • Step S408: Prompt the user that downloadable media files have been sniffed on the webpage, and display the file names of the download resources in a list.
  • Preferably, before the user is prompted in step S106 to download the remaining URL resources, the following operation may also be included: Step S9: Perform URL verification on the remaining URL resources using a preset URL specification. That is, the download resource URLs that remain after advertisement filtering must be checked against the URL specification. Normally, only when such a URL conforms to the URL specification is the user prompted that downloadable media files have been sniffed on the webpage, with the file names of the download resources displayed in a list for the user to download.
  • FIG. 5 is a structural block diagram of a resource downloading apparatus according to an embodiment of the present invention. As shown in FIG. 5, the resource downloading apparatus may include: a detecting module 10, configured to dynamically detect the DOM tree of the webpage where the resource to be downloaded is located and to acquire a plurality of URL resources; a filtering module 20, configured to filter out, from the plurality of URL resources, the partial URL resources corresponding to advertisement data; and a processing module 30, configured to prompt the user to download the URL resources remaining after the advertisement data is filtered out.
  • Preferably, as shown in FIG. 6, the detecting module 10 may include: a determining unit 100, configured to determine whether the DOM tree contains a tag from a preset tag set, where the preset tag set includes at least one of the following: a video tag, an audio tag, and an object tag; and an extracting unit 102, configured to, when the output of the determining unit is yes, obtain the href attribute corresponding to the tag and extract the plurality of URL resources from the href attribute.
  • Preferably, the filtering module 20 may include: a receiving unit 200, configured to receive the most recently updated advertisement interception data from a server, where the advertisement interception data includes identification information of the partial URL resources and feature information that determines the partial URL resources to be advertisement data to be intercepted; and a filtering unit 202, configured to use the advertisement interception data to filter out the partial URL resources from the plurality of URL resources.
  • Preferably, the processing module 30 may include: a naming unit 300, configured to name the remaining URL resources; and a display unit 302, configured to display the file names of the remaining URL resources in a preset display manner.
  • Preferably, the naming unit 300 may include: a first setting subunit (not shown), configured to set the title of the webpage where each of the remaining URL resources is located as a first file name; a second setting subunit (not shown), configured to set the last N characters of each URL resource as a second file name, where N is a positive integer; and a combining subunit (not shown), configured to combine the first file name and the second file name to name each URL resource.
  • Preferably, the detecting module 10 is configured to dynamically detect the DOM tree in one of the following ways: detecting the DOM tree at a preset period; detection of the DOM tree triggered automatically by a background script of the webpage; and detection of the DOM tree triggered after a webpage sub-resource loading event, caused by the user clicking a preset button, is captured.
  • Preferably, the apparatus may further include: a verification module 40, configured to perform URL verification on the remaining URL resources using a preset URL specification.
  • FIG. 7 is a schematic diagram of the overall architecture of browser dynamic sniffing with advertisement filtering according to a preferred embodiment of the present invention. As shown in FIG. 7, the architecture may include: a media detection module (corresponding to the detecting module above); a sub-resource loading listener module; an advertisement interception data update module and an advertisement interception execution module (together corresponding to the filtering module above); a URL verification module (corresponding to the verification module above); a file naming module (corresponding to the naming unit above); and a prompting and downloading module (corresponding to part of the processing module above).
  • The functions implemented by each module are as follows:
  • (1) The media detection module is mainly responsible for finding the nodes of the relevant media tags in the current Document Object Model (DOM) tree of the webpage, and obtaining the corresponding downloadable URL resources from those nodes.
  • (2) The sub-resource loading listener module is mainly responsible for monitoring, throughout the life cycle of the webpage, whether any sub-resources are being loaded, so that the media detection module can start a new detection and downloadable resources can be sniffed dynamically.
  • (3) The advertisement interception data update module is mainly responsible for periodically sending a data update request to the server to obtain the latest advertisement interception information, which may include the keywords of advertisement URL resources on current mainstream audio and/or video websites.
  • (4) The advertisement interception execution module is mainly responsible for checking the URLs obtained by the media detection module against the advertisement interception information updated from the server, determining whether each is on the blacklist (that is, whether it matches a keyword in the advertisement interception information), and deleting the URLs that are.
  • (5) The URL verification module is mainly responsible for verifying the download resource URLs that remain after advertisement filtering and determining whether they conform to the URL specification.
  • (6) The file naming module is mainly responsible for obtaining the title and the URL address of the current webpage in order to construct the name of the download file.
  • (7) The prompting and downloading module is mainly responsible for showing the sniffed download resources to the user and managing the download once the user chooses one.
  • From the above description, it can be seen that the above embodiments achieve the following technical effects (it should be noted that these are effects that certain preferred embodiments can achieve):
  • The technical solution provided by the embodiments of the present invention can effectively solve problems in existing browser sniffing techniques such as missed or false detection, and in particular the serious interference of advertisement media files and the currently used download file naming scheme that leaves users unable to tell which downloaded file is the one they need. Users can thus freely obtain the media files that interest them from webpages that offer only online playback, which greatly improves the user experience.
  • The modules or steps of the present invention can be implemented by a general-purpose computing device; they can be concentrated on a single computing device or distributed over a network of multiple computing devices. Alternatively, they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device. In some cases the steps shown or described can be performed in an order different from the one given here, or the modules or steps can be made into individual integrated circuit modules, or several of them can be made into a single integrated circuit module.
  • The invention is therefore not limited to any specific combination of hardware and software.
  • The above is only a preferred embodiment of the present invention and is not intended to limit it; various modifications and changes can be made to the present invention.
  • The resource downloading method and apparatus provided by the embodiments of the present invention have the following beneficial effects: they can effectively solve problems in existing browser sniffing techniques such as missed or false detection, and in particular the serious interference of advertisement media files and the currently used download file naming scheme that leaves users unable to tell which downloaded file is the one they need. Users can thus freely obtain the media files that interest them from webpages that offer only online playback, which greatly improves the user experience.
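
The tag check of steps S1 and S2 (and S204/S206) can be illustrated with a small sketch. It is only an approximation under stated assumptions: it parses a snapshot of the page markup with Python's standard html.parser instead of walking a live browser DOM tree, and the names MediaTagScanner and detect_media_urls are hypothetical. The text reads the href attribute; the sketch also looks at src and data, which is an added assumption, since those are the attributes these tags commonly carry.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Preset tag set named in the text: video, audio, object.
MEDIA_TAGS = {"video", "audio", "object"}
# "href" follows the text; "src" and "data" are an assumption added here.
URL_ATTRS = ("href", "src", "data")


class MediaTagScanner(HTMLParser):
    """Collects candidate media URLs from tags in the preset tag set."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lower case.
        if tag not in MEDIA_TAGS:
            return
        attr_map = dict(attrs)
        for name in URL_ATTRS:
            value = attr_map.get(name)
            if not value:
                continue
            # Resolve relative references against the page URL.
            absolute = urljoin(self.page_url, value)
            # Skip duplicates, mirroring the repeat check of step S206.
            if absolute not in self.urls:
                self.urls.append(absolute)


def detect_media_urls(html, page_url):
    """Return the de-duplicated candidate URLs found in one markup snapshot."""
    scanner = MediaTagScanner(page_url)
    scanner.feed(html)
    return scanner.urls
```

To approximate the dynamic detection described in the text, detect_media_urls would be called again on a fresh snapshot each time a sub-resource loading event is observed; how such events are captured depends on the browser and is not shown here.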

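The advertisement filtering of steps S302 and S304 can be sketched in a similar way. The table below stands in for the advertisement interception data table fetched from the server; its host names and keywords are invented for illustration, and filter_ads is a hypothetical name. As in the text, the HostName is taken from the URL of the current page, and a candidate URL is dropped when it contains any key registered for that HostName.

```python
from urllib.parse import urlsplit

# Hypothetical stand-in for the advertisement interception data table:
# one unique HostName maps to one or more keys. All values are invented.
AD_TABLE = {
    "video.example.com": ["/ad/", "preroll"],
    "music.example.com": ["adserver"],
}


def filter_ads(page_url, candidate_urls, ad_table):
    """Drop candidate URLs that contain a key registered for the page's host."""
    # Step S304: obtain the HostName from the URL of the current page itself.
    host = urlsplit(page_url).hostname or ""
    # Look up the keys for that HostName; an unknown host has no keys.
    keys = ad_table.get(host, [])
    # A URL containing any key is treated as an advertisement and removed.
    return [url for url in candidate_urls if not any(key in url for key in keys)]
```

Plain substring matching is the simplest reading of "keys contained in the URL"; a real table might need stricter matching to avoid dropping legitimate files whose paths happen to contain a key.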
Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • General Engineering & Computer Science (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Data Mining & Analysis (AREA)
  • Finance (AREA)
  • Strategic Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

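The present invention discloses a method and apparatus for downloading resources. In the method, the DOM tree of the webpage where the resource to be downloaded is located is dynamically detected to acquire a plurality of URL resources; the partial URL resources corresponding to advertisement data are filtered out from the plurality of URL resources; and the user is prompted to download the URL resources remaining after the advertisement data is filtered out. With the technical solution provided by the present invention, users can freely obtain the media files that interest them from webpages that offer only online playback, which greatly improves the user experience.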
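
The last step of the method, prompting the user with the remaining URL resources, is described in the specification as URL verification followed by file naming (page title plus the last N characters of the URL, typically 8). The sketch below illustrates that under assumptions: the source does not say what its "preset URL specification" checks, so requiring an http or https scheme and a host is a placeholder, and replacing characters that are unsafe in file names is an added convenience. All function names are hypothetical.

```python
import re
from urllib.parse import urlsplit


def is_valid_url(url):
    """Placeholder URL check (assumption): http/https scheme and a host."""
    parts = urlsplit(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def build_file_name(page_title, url, n=8):
    """Combine the page title (main name) with the last n URL characters."""
    # Main file name: the page title, with runs of unsafe characters
    # and whitespace collapsed to a single underscore (added assumption).
    main = re.sub(r'[\\/:*?"<>|\s]+', "_", page_title).strip("_")
    # Secondary file name: the last n characters, which usually keep the
    # file type suffix such as ".mp3".
    sub = re.sub(r'[\\/:*?"<>|]', "_", url[-n:])
    return f"{main}_{sub}"


def names_for_prompt(page_title, urls, n=8):
    """File names to list for the user, skipping URLs that fail the check."""
    return [build_file_name(page_title, u, n) for u in urls if is_valid_url(u)]
```

For a page titled "My Concert: Live" and the URL http://cdn.example.com/media/track0042.mp3, this yields the name My_Concert_Live_0042.mp3.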

Description

资源的下载方法及装置 技术领域 本发明涉及通信领域, 具体而言, 涉及一种资源的下载方法及装置。 背景技术 随着无线通讯技术和互联网技术的飞速发展, 在移动终端上使用浏览器进行上网 的用户越来越多。 作为移动互联网入口, 浏览器的重要性不言而喻。 如何在移动终端 上提高浏览器的用户体验, 从而在白热化的市场竞争中取得亮点并占有市场份额已经 成为目前浏览器技术研究的重点。 在通常情况下, 用户在使用移动终端浏览器浏览一些视频、 音乐网站时, 发现在 网页上自己喜欢的视频或音乐只能在线播放, 而没有任何可以对该多媒体文件进行下 载的操作按钮, 从而无法将这些文件下载到本地来随时进行流畅的播放。 而多次在线 播放不仅消耗网络流量,而且受到网络带宽的影响经常在播放过程中会出现卡顿现象, 继而降低了用户体验。 基于上述问题的存在, 一种被称为 "嗅探" 的浏览器技术应运 而生, 其主要原理是在网页资源加载完成后, 对网页内各个标签进行检测, 若检测到 视频或音频等标签则获取它们对应的统一资源定位符 (URL), 经过 URL验证再提示 给用户是否需要下载。通过用户选择提示下载选项对应的 URL即可将多媒体文件下载 到本地进行播放。 然而, 上述常规嗅探方法仍然存在以下缺陷:
( 1 )这是一种静态的检测方法, 然而在网页加载完成后, 其媒体内容经常会发生 动态变化, 因而常规嗅探往往会出现漏检测或误检测的情况。 (2) 很多网页在打开过程中会事先播放一段广告视频等与用户需求不相关的内 容, 而常规的嗅探经常将这些广告嗅探出来交给用户下载, 但是对于用户原本需求的 媒体文件却无法嗅探。
(3 ) 在嗅探阶段往往无法获知下载的文件名, 常规办法只能采用对应 URL结尾 字符串来命名, 用户下载后往往不确定哪个是自己下载的文件。 发明内容 本发明提供了一种资源的下载方法及装置, 以至少解决相关技术中在视频或音乐 等网站的媒体文件仅提供在线播放而不支持本地下载的情况下, 如何使用浏览器对媒 体文件进行嗅探以便向用户提供本地下载的问题。 根据本发明的一个方面, 提供了一种资源的下载方法。 根据本发明实施例的资源的下载方法包括: 对待下载资源所在网页的文档对象模 型 (DOM) 树进行动态检测, 获取多个统一资源定位符 (URL) 资源; 从多个 URL 资源中滤除与广告数据对应的部分 URL资源;提示用户对滤除广告数据后剩余的 URL 资源进行下载。 优选地, 对 DOM树进行实时检测, 获取多个 URL资源包括: 判断 DOM树中是 否存在预设标签集合中的标签, 其中, 预设标签集合包括以下至少之一: 视频(video) 标签、 音频 (audio) 标签、 对象 (object) 标签; 如果存在, 则获取标签对应的超文 本引用 (href) 属性, 并从 href属性中提取多个 URL资源。 优选地, 从多个 URL资源中滤除部分 URL资源包括: 接收来自于服务器的最近 更新的广告拦截数据信息, 其中, 广告拦截数据信息包括: 部分 URL资源的标识信息 以及确定部分 URL资源为待拦截的广告数据的特征信息;采用广告拦截数据信息从多 个 URL资源中滤除部分 URL资源。 优选地, 提示用户对剩余的 URL资源进行下载包括: 对剩余的 URL资源进行命 名; 按照预设显示方式对剩余的 URL资源的文件名进行显示。 优选地,对剩余的 URL资源进行命名包括:将剩余的 URL资源中的每个 URL资 源所在网页的标题设置为第一文件名; 将每个 URL资源的最后 N位字符设置为第二 文件名, 其中, N为正整数; 将第一文件名和第二文件名进行组合, 对每个 URL资源 进行命名。 优选地, 对 D0M树进行动态检测包括以下之一: 按照预设周期对 D0M树进行 检测; 由网页后台脚本自动触发对 D0M树进行检测; 通过用户点击预设按钮引发的 网页子资源加载事件被捕获后触发对 D0M树进行检领 优选地,在提示用户对剩余的 URL资源进行下载之前,还包括:采用预设的 URL 规范对剩余的 URL资源进行 URL验证。 根据本发明的另一方面, 提供了一种资源的下载装置。 根据本发明实施例的资源的下载装置包括: 检测模块, 设置为对待下载资源所在 网页的 DOM树进行动态检测, 获取多个 URL资源; 过滤模块, 设置为从多个 URL 资源中滤除与广告数据对应的部分 URL资源; 处理模块, 设置为提示用户对滤除广告 数据后剩余的 URL资源进行下载。 优选地, 检测模块包括: 判断单元, 设置为判断 DOM树中是否存在预设标签集 合中的标签, 其中, 预设标签集合包括以下至少之一: video标签、 audio标签、 object 标签;提取单元,设置为在判断单元输出为是时,获取标签对应的 href属性, 并从 href 属性中提取多个 URL资源。 优选地, 过滤模块包括: 接收单元, 设置为接收来自于服务器的最近更新的广告 拦截数据信息, 其中, 广告拦截数据信息包括: 部分 URL资源的标识信息以及确定部 分 URL资源为待拦截的广告数据的特征信息; 过滤单元, 设置为采用广告拦截数据信 息从多个 URL资源中滤除部分 URL资源。 优选地, 处理模块包括: 命名单元, 设置为对剩余的 URL资源进行命名; 显示单 元, 设置为按照预设显示方式对剩余的 URL资源的文件名进行显示。 优选地, 命名单元包括: 第一设置子单元, 设置为将剩余的 URL 资源中的每个 URL 资源所在网页的标题设置为第一文件名; 第二设置子单元, 设置为将每个 URL 资源的最后 N位字符设置为第二文件名, 其中, N为正整数; 组合子单元, 设置为将 第一文件名和第二文件名进行组合, 对每个 URL资源进行命名。 优选地, 检测模块, 设置为按照以下方式之一对 D0M树进行动态检测: 按照预 设周期对 D0M树进行检测; 由网页后台脚本自动触发对 D0M树进行检测; 通过用 户点击预设按钮引发的网页子资源加载事件被捕获后触发对 D0M树进行检测。 优选地,上述装置还包括:验证模块,设置为采用预设的 URL规范对剩余的 URL 资源进行 URL验证。 通过本发明实施例, 采用对待下载资源所在网页的 D0M树进行动态检测, 获取 多个 URL资源; 从多个 URL资源中滤除与广告数据对应的部分 URL资源; 提示用户 对滤除广告数据后剩余的 
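URL resources remaining after the advertisement data has been filtered out. This solves the problem in the related art of how to use a browser to sniff media files so as to offer the user a local download when a video or music website only provides online playback of its media files and does not support local download. As a result, on web pages that only provide online playback, the user can freely obtain the media files of interest, which greatly improves the user experience.

Brief Description of the Drawings

The drawings described here are provided for a further understanding of the present invention and form a part of this application. The illustrative embodiments of the present invention and their descriptions are used to explain the present invention and do not unduly limit it. In the drawings:

Fig. 1 is a flowchart of a resource downloading method according to an embodiment of the present invention;
Fig. 2 is a flowchart of a method for dynamically detecting the media resources of a web page according to a preferred embodiment of the present invention;
Fig. 3 is a flowchart of a method for filtering advertisements out of, and validating, the resources sniffed from a web page according to a preferred embodiment of the present invention;
Fig. 4 is a flowchart of a method for naming and announcing files sniffed from a web page for download according to a preferred embodiment of the present invention;
Fig. 5 is a structural block diagram of a resource downloading device according to an embodiment of the present invention;
Fig. 6 is a structural block diagram of a resource downloading device according to a preferred embodiment of the present invention;
Fig. 7 is a schematic diagram of the overall architecture of advertisement-filtering dynamic browser sniffing according to a preferred embodiment of the present invention.

Detailed Description

The present invention is described in detail below with reference to the drawings and in combination with the embodiments. It should be noted that, where no conflict arises, the embodiments in this application and the features in the embodiments may be combined with each other.

Fig. 1 is a flowchart of a resource downloading method according to an embodiment of the present invention. As shown in Fig. 1, the method may include the following processing steps:

Step S102: dynamically detect the DOM tree of the web page where the resources to be downloaded are located, and acquire multiple URL resources;
Step S104: filter out, from the multiple URL resources, the part of the URL resources that corresponds to advertisement data;
Step S106: prompt the user to download the URL resources remaining after the advertisement data has been filtered out.

In the related art, when a video or music website only provides online playback of its media files and does not support local download, offering the user a local download by sniffing media files with an existing browser lacks accuracy and flexibility. With the method shown in Fig. 1, dynamically detecting the DOM tree of the web page where the resources to be downloaded are located and acquiring multiple URL resources avoids the missed detections and false detections found in existing browser sniffing techniques, and filtering out the part of the URL resources that corresponds to advertisement data avoids serious interference from advertisement media files. The user is then prompted to download the URL resources remaining after the advertisement data has been filtered out. This solves the problem in the related art of how to use a browser to sniff media files so as to offer the user a local download when a video or music website only provides online playback and does not support local download, so that on web pages that only provide online playback the user can freely obtain the media files of interest, which greatly improves the user experience.

Preferably, in step S102, detecting the DOM tree in real time and acquiring the multiple URL resources may include the following operations:

Step S1: judge whether a tag from a preset tag set exists in the DOM tree, where the preset tag set includes at least one of the following: a video tag, an audio tag, and an object tag;
Step S2: if such a tag exists, acquire the hypertext reference (href) attribute corresponding to the tag, and extract the multiple URL resources from the href attribute.

In a preferred implementation, dynamically detecting the DOM tree may include one of the following modes:

Mode 1: detect the DOM tree at a preset period;
Mode 2: a background script of the web page automatically triggers detection of the DOM tree;
Mode 3: detection of the DOM tree is triggered after a web page sub-resource loading event, caused by the user clicking a preset button, has been captured.

As a preferred embodiment of the present invention, Fig. 2 is a flowchart of a method for dynamically detecting the media resources of a web page. As shown in Fig. 2, the flow may include the following processing steps:

Step S202: after the page-load-finished signal is received from the browser kernel, start the detection operation on the DOM tree of the web page;
Step S204: if the DOM tree of the current web page is found to contain a <video> and/or <audio> and/or <object> tag, acquire the hypertext reference (Hypertext Reference, href for short) attribute of the node corresponding to that tag, that is, the download URL resource of the audio or video file, and continue with step S206; otherwise, go to step S208;
Step S206: judge whether the download resource URL acquired from the current web page is a duplicate; if it is not, the advertisement filtering operation can proceed and this stage ends; if it is, go to step S208;
Step S208: start the web page sub-resource loading listening flow; once a web page sub-resource loading event, initiated automatically by the page's background JavaScript or caused by the user clicking the play button, has been captured, step S202 is notified to run the detection again.

Through this dynamic listening and detection, changes to the internal nodes of the DOM tree can be monitored in real time, which avoids missed detections during sniffing.

Preferably, in step S104, filtering out the part of the URL resources from the multiple URL resources may include the following steps:

Step S3: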
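receive the most recently updated advertisement-blocking data information from a server, where the advertisement-blocking data information may include, but is not limited to: identification information of the part of the URL resources (for example, website host name (HostName) information) and feature information for determining that the part of the URL resources is advertisement data to be blocked (for example, keyword information of the advertisement data corresponding to the HostName information);
Step S4: use the advertisement-blocking data information to filter out the part of the URL resources from the multiple URL resources.

As a further preferred embodiment of the present invention, Fig. 3 is a flowchart of a method for filtering advertisements out of, and validating, the resources sniffed from a web page. As shown in Fig. 3, the flow may include the following processing steps:

Step S302: send an update request message to the server and obtain the latest advertisement-blocking data table from it. The table may include two fields: the first is the website host name (that is, HostName) of certain currently popular video and/or audio websites; the second is the keyword (key) contained in the URLs of the advertisements played by that website. A HostName is unique within the advertisement-blocking data table, and one HostName may correspond to multiple keys;
Step S304: match the acquired candidate download resource URLs against the advertisement-blocking data. In this preferred embodiment, the HostName is first obtained from the URL of the current web page itself; the advertisement-blocking data table is then searched for keys corresponding to that HostName; finally, each of these keys is checked for occurrence in the candidate download resource URL. If a key occurs in it, the download resource URL belongs to an advertisement resource and is filtered out directly; otherwise, continue with step S306;
Step S306: judge whether the download resource URL that passed advertisement filtering conforms to the URL specification; if it does not, filter it out directly; if it does, continue with the subsequent flow for naming the download file.

Preferably, in step S106, prompting the user to download the remaining URL resources may include the following operations:

Step S5: name the remaining URL resources;
Step S6: display the file names of the remaining URL resources in a preset display manner.

Preferably, in step S5, naming the remaining URL resources may include the following steps:

Step S51: set the title of the web page where each of the remaining URL resources is located as a first file name;
Step S52: set the last N characters of each URL resource as a second file name, where N is a positive integer;
Step S53: combine the first file name and the second file name to name each URL resource.

As another preferred embodiment of the present invention, Fig. 4 is a flowchart of a method for naming and announcing files sniffed from a web page for download. As shown in Fig. 4, the flow may include the following processing steps:

Step S402: use the web page title as the main file name of the download file name; the web page title usually contains fairly accurate information about the download file;
Step S404: extract the last N characters (N being a positive integer) of the download resource URL address as the sub file name of the download file name; for example, the last 8 characters are typically extracted, mainly so that the file type suffix is retained as far as possible;
Step S406: combine the main file name and the sub file name into the file name of the download resource;
Step S408: notify the user that downloadable media files have been sniffed on the web page, and display the file names of the download resources as a list.

Preferably, before the user is prompted in step S106 to download the remaining URL resources, the following operation may also be included:

Step S9: perform URL validation on the remaining URL resources using a preset URL specification. That is, the URL download resources that remain after advertisement filtering need to be validated to judge whether they conform to the URL specification. Normally, only when a URL download resource that remains after advertisement filtering conforms to the URL specification is the user notified that downloadable media files have been sniffed on the web page, with the file names of the download resources displayed as a list so that the user can download them.

Fig. 5 is a structural block diagram of a resource downloading device according to an embodiment of the present invention. As shown in Fig. 5, the resource downloading device may include: a detection module 10, configured to dynamically detect the DOM tree of the web page where the resources to be downloaded are located and to acquire multiple URL resources; a filtering module 20, configured to filter out, from the multiple URL resources, the part of the URL resources that corresponds to advertisement data; and a processing module 30, configured to prompt the user to download the URL resources remaining after the advertisement data has been filtered out.

The device shown in Fig. 5 solves the problem in the related art of how to use a browser to sniff media files so as to offer the user a local download when a video or music website only provides online playback of its media files and does not support local download, so that on web pages that only provide online playback the user can freely obtain the media files of interest, which greatly improves the user experience.

Preferably, as shown in Fig. 6, the detection module 10 may include: a judging unit 100, configured to judge whether a tag from a preset tag set exists in the DOM tree, where the preset tag set includes at least one of the following: a video tag, an audio tag, and an object tag; and an extraction unit 102, configured to, when the output of the judging unit is yes, acquire the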
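href attribute corresponding to the tag and extract the multiple URL resources from the href attribute.

Preferably, as shown in Fig. 6, the filtering module 20 may include: a receiving unit 200, configured to receive the most recently updated advertisement-blocking data information from a server, where the advertisement-blocking data information includes identification information of the part of the URL resources and feature information for determining that the part of the URL resources is advertisement data to be blocked; and a filtering unit 202, configured to use the advertisement-blocking data information to filter out the part of the URL resources from the multiple URL resources.

Preferably, as shown in Fig. 6, the processing module 30 may include: a naming unit 300, configured to name the remaining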
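URL resources; and a display unit 302, configured to display the file names of the remaining URL resources in a preset display manner.

Preferably, the naming unit 300 may include: a first setting subunit (not shown in the figure), configured to set the title of the web page where each of the remaining URL resources is located as a first file name; a second setting subunit (not shown in the figure), configured to set the last N characters of each URL resource as a second file name, where N is a positive integer; and a combining subunit (not shown in the figure), configured to combine the first file name and the second file name to name each URL resource.

Preferably, the detection module 10 is configured to dynamically detect the DOM tree in one of the following ways: detecting the DOM tree at a preset period; having a background script of the web page automatically trigger detection of the DOM tree; and triggering detection of the DOM tree after a web page sub-resource loading event, caused by the user clicking a preset button, has been captured.

Preferably, as shown in Fig. 6, the device may further include: a validation module 40, configured to perform URL validation on the remaining URL resources using a preset URL specification.

The preferred implementation described above is further explained below with reference to the preferred embodiment shown in Fig. 7.

Fig. 7 is a schematic diagram of the overall architecture of advertisement-filtering dynamic browser sniffing according to a preferred embodiment of the present invention. As shown in Fig. 7, the overall architecture may include: a media detection module (equivalent to the detection module above), a sub-resource loading listener module, an advertisement-blocking data update module, an advertisement-blocking execution module (equivalent to the filtering module above), a URL validation module (equivalent to the validation module above), a file naming module (equivalent to the naming unit above), and a notification and download module (equivalent to part of the functions of the processing module above). The functions implemented by each module are as follows:

(1) The media detection module is mainly responsible for retrieving, in the current Document Object Model (DOM) tree of the web page, the nodes that carry the relevant media tags, and for obtaining from such a node the corresponding downloadable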
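(2) The sub-resource loading listener module is mainly responsible for monitoring, throughout the whole life cycle of the web page document, whether there are sub-resources waiting to be loaded, so that it can notify the media detection module to run another detection and the download resources can be sniffed dynamically.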
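(3) The advertisement-blocking data update module is mainly responsible for periodically sending data update requests to the server to obtain the latest advertisement-blocking information, which may include the keywords of advertisement URL resources on certain currently mainstream audio and/or video websites.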
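(4) The advertisement-blocking execution module is mainly responsible for using the advertisement-blocking information updated from the server to check the URLs acquired by the media detection module, judging whether each one is on the blacklist (that is, whether it matches a keyword in the advertisement-blocking information), and deleting and blocking the URLs that are on the blacklist.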
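(5) The URL validation module is mainly responsible for validating the URL download resources that remain after advertisement filtering and judging whether they conform to the URL specification.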
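(6) The file naming module is mainly responsible for obtaining the title information and URL address of the current web page in order to construct the name of the download file.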
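(7) The notification and download module is mainly responsible for presenting the sniffed download resource information to the user and for managing the download once the user chooses to download.

From the above description it can be seen that the embodiments achieve the following technical effects (it should be noted that these are effects that certain preferred embodiments can achieve): the technical solution provided by the embodiments of the present invention can effectively address the missed detections and false detections found in existing browser sniffing techniques, in particular the serious interference from advertisement media files and the fact that the download file naming currently in use leaves users unable to tell which download file is the one they actually want. As a result, on web pages that only provide online playback, the user can freely obtain the media files of interest, which greatly improves the user experience.

Obviously, those skilled in the art should understand that the modules or steps of the present invention described above can be implemented with general-purpose computing devices. They can be concentrated on a single computing device or distributed over a network formed by multiple computing devices. Optionally, they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by a computing device. In some cases the steps shown or described can be executed in an order different from the one given here, or they can each be made into individual integrated circuit modules, or several of the modules or steps can be made into a single integrated circuit module. The present invention is thus not limited to any specific combination of hardware and software.

The above are only preferred embodiments of the present invention and are not intended to limit it. For those skilled in the art, the present invention may have various modifications and changes. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present invention shall fall within its scope of protection.

Industrial Applicability

As described above, the resource downloading method and device provided by the embodiments of the present invention have the following beneficial effects: they can effectively address the missed detections and false detections found in existing browser sniffing techniques, in particular the serious interference from advertisement media files and the fact that the download file naming currently in use leaves users unable to tell which download file is the one they actually want. As a result, on web pages that only provide online playback, the user can freely obtain the media files of interest, which greatly improves the user experience.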
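The detection flow of steps S202 to S208 can be sketched in a few lines. The following is only an illustration under stated assumptions, not the implementation described in this application: Python's standard html.parser stands in for the browser's DOM tree, the names MediaTagScanner and detect_media_urls and the example.com URLs are hypothetical, and the src and data attributes are read in addition to href because HTML media and object elements usually carry their URL there.

```python
from html.parser import HTMLParser

# Preset tag set named in the text: video, audio and object tags.
MEDIA_TAGS = {"video", "audio", "object"}
# The text reads the href attribute. "src" and "data" are an added assumption,
# since HTML media and object elements usually carry the URL in those attributes.
URL_ATTRS = ("href", "src", "data")


class MediaTagScanner(HTMLParser):
    """Collects candidate URLs from media tags in an HTML snapshot."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag not in MEDIA_TAGS:
            return
        attr_map = dict(attrs)
        for name in URL_ATTRS:
            value = attr_map.get(name)
            if value:
                self.urls.append(value)
                break  # one URL per tag is enough for this sketch


def detect_media_urls(html, seen):
    """Return URLs not found in earlier passes (the duplicate check of step S206)."""
    scanner = MediaTagScanner()
    scanner.feed(html)
    scanner.close()
    fresh = []
    for url in scanner.urls:
        if url not in seen:
            seen.add(url)
            fresh.append(url)
    return fresh


# First pass, as if run after the page-load-finished signal (step S202).
seen_urls = set()
page = '<div><video href="http://example.com/a.mp4"></video></div>'
first = detect_media_urls(page, seen_urls)

# Second pass, as if a sub-resource loading event (step S208) had added a tag.
page += '<audio src="http://example.com/b.mp3"></audio>'
second = detect_media_urls(page, seen_urls)
```

Calling detect_media_urls again on each captured loading event, with the same seen set, mirrors the loop from step S208 back to step S202: only URLs that did not appear in an earlier pass are handed on to advertisement filtering.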

Claims

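1. A resource downloading method, comprising:
dynamically detecting the Document Object Model (DOM) tree of the web page where the resources to be downloaded are located, and acquiring multiple Uniform Resource Locator (URL) resources;
filtering out, from the multiple URL resources, the part of the URL resources that corresponds to advertisement data;
prompting the user to download the URL resources remaining after the advertisement data has been filtered out.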
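2. The method according to claim 1, wherein detecting the DOM tree in real time and acquiring the multiple URL resources comprises:
judging whether a tag from a preset tag set exists in the DOM tree, wherein the preset tag set includes at least one of the following: a video tag, an audio tag, and an object tag;
if such a tag exists, acquiring the hypertext reference (href) attribute corresponding to the tag, and extracting the multiple URL resources from the href attribute.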
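3. The method according to claim 1, wherein filtering out the part of the URL resources from the multiple URL resources comprises:
receiving the most recently updated advertisement-blocking data information from a server, wherein the advertisement-blocking data information includes: identification information of the part of the URL resources, and feature information for determining that the part of the URL resources is advertisement data to be blocked;
using the advertisement-blocking data information to filter out the part of the URL resources from the multiple URL resources.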
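4. The method according to claim 1, wherein prompting the user to download the remaining URL resources comprises:
naming the remaining URL resources;
displaying the file names of the remaining URL resources in a preset display manner.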
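5. The method according to claim 4, wherein naming the remaining URL resources comprises:
setting the title of the web page where each of the remaining URL resources is located as a first file name;
setting the last N characters of each URL resource as a second file name, wherein N is a positive integer;
combining the first file name and the second file name to name each URL resource.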
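6. The method according to claim 1, wherein dynamically detecting the DOM tree comprises one of the following:
detecting the DOM tree at a preset period;
having a background script of the web page automatically trigger detection of the DOM tree;
triggering detection of the DOM tree after a web page sub-resource loading event, caused by the user clicking a preset button, has been captured.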
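7. The method according to any one of claims 1 to 6, wherein, before prompting the user to download the remaining URL resources, the method further comprises:
performing URL validation on the remaining URL resources using a preset URL specification.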
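8. A resource downloading device, comprising:
a detection module, configured to dynamically detect the Document Object Model (DOM) tree of the web page where the resources to be downloaded are located, and to acquire multiple Uniform Resource Locator (URL) resources;
a filtering module, configured to filter out, from the multiple URL resources, the part of the URL resources that corresponds to advertisement data;
a processing module, configured to prompt the user to download the URL resources remaining after the advertisement data has been filtered out.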
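9. The device according to claim 8, wherein the detection module comprises:
a judging unit, configured to judge whether a tag from a preset tag set exists in the DOM tree, wherein the preset tag set includes at least one of the following: a video tag, an audio tag, and an object tag;
an extraction unit, configured to, when the output of the judging unit is yes, acquire the hypertext reference (href) attribute corresponding to the tag and extract the multiple URL resources from the href attribute.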
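10. The device according to claim 8, wherein the filtering module comprises:
a receiving unit, configured to receive the most recently updated advertisement-blocking data information from a server, wherein the advertisement-blocking data information includes: identification information of the part of the URL resources, and feature information for determining that the part of the URL resources is advertisement data to be blocked;
a filtering unit, configured to use the advertisement-blocking data information to filter out the part of the URL resources from the multiple URL resources.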
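11. The device according to claim 8, wherein the processing module comprises:
a naming unit, configured to name the remaining URL resources;
a display unit, configured to display the file names of the remaining URL resources in a preset display manner.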
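12. The device according to claim 11, wherein the naming unit comprises:
a first setting subunit, configured to set the title of the web page where each of the remaining URL resources is located as a first file name;
a second setting subunit, configured to set the last N characters of each URL resource as a second file name, wherein N is a positive integer;
a combining subunit, configured to combine the first file name and the second file name to name each URL resource.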
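13. The device according to claim 8, wherein the detection module is configured to dynamically detect the DOM tree in one of the following ways:
detecting the DOM tree at a preset period;
having a background script of the web page automatically trigger detection of the DOM tree;
triggering detection of the DOM tree after a web page sub-resource loading event, caused by the user clicking a preset button, has been captured.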
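14. The device according to any one of claims 8 to 13, wherein the device further comprises:
a validation module, configured to perform URL validation on the remaining URL resources using a preset URL specification.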
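The filtering, validation and naming recited in claims 3, 5 and 7 (and their device counterparts) can be illustrated with a short sketch. It is a minimal example under stated assumptions rather than the claimed implementation: the blocking table contents, the example.com URLs, the "_" separator between the two file names, the validity rule (http or https scheme plus a host), and all function names are hypothetical.

```python
from urllib.parse import urlparse

# Hypothetical blocking table: one HostName maps to several keywords (keys).
AD_BLOCK_TABLE = {
    "video.example.com": ["/ad/", "adserver"],
}


def is_ad(page_url, resource_url, table):
    """Look up the page's HostName, then test each of its keys against the URL."""
    host = urlparse(page_url).hostname or ""
    return any(key in resource_url for key in table.get(host, []))


def is_valid_url(url):
    """Assumed stand-in for the preset URL specification: http(s) scheme and a host."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def make_file_name(page_title, resource_url, n=8):
    """Page title as first file name, last N URL characters as second file name."""
    return page_title + "_" + resource_url[-n:]


def downloadable(page_url, page_title, candidates, table):
    """Drop advertisement and malformed URLs, then name what remains."""
    names = []
    for url in candidates:
        if is_ad(page_url, url, table) or not is_valid_url(url):
            continue
        names.append(make_file_name(page_title, url))
    return names


names = downloadable(
    "http://video.example.com/watch?id=1",
    "Concert",
    [
        "http://cdn.example.com/ad/intro.mp4",         # matches the key "/ad/"
        "http://cdn.example.com/media/concert01.mp4",  # kept
        "not a url",                                   # fails validation
    ],
    AD_BLOCK_TABLE,
)
```

With N = 8 the second file name of the surviving URL is "rt01.mp4", so the file type suffix is retained, which is the reason the description gives for taking the last 8 characters. A real implementation would probably also need to strip characters such as "/" or "?" that are not allowed in file names; that step is not covered here.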
PCT/CN2014/083594 2014-06-10 2014-08-01 Resource downloading method and device WO2015188431A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US15/317,122 US10262341B2 (en) 2014-06-10 2014-08-01 Resource downloading method and device
EP14894424.2A EP3142020A4 (en) 2014-06-10 2014-08-01 Resource downloading method and device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201410255545.7 2014-06-10
CN201410255545.7A CN105320661A (zh) 2014-06-10 2014-06-10 Resource downloading method and device

Publications (1)

Publication Number Publication Date
WO2015188431A1 true WO2015188431A1 (zh) 2015-12-17

Family

ID=54832763

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2014/083594 WO2015188431A1 (zh) 2014-06-10 2014-08-01 Resource downloading method and device

Country Status (4)

Country Link
US (1) US10262341B2 (zh)
EP (1) EP3142020A4 (zh)
CN (2) CN105279215A (zh)
WO (1) WO2015188431A1 (zh)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
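CN105279215A (zh) * 2014-06-10 2016-01-27 中兴通讯股份有限公司 Resource downloading method and device
CN105897900A (zh) * 2016-04-22 2016-08-24 北京小米移动软件有限公司 Resource acquisition method and device
CN106095869B (zh) * 2016-06-03 2020-11-06 腾讯科技(深圳)有限公司 Advertisement information processing method, user equipment, background server, and system
CN107220364B (zh) * 2017-06-07 2021-01-26 深圳天珑无线科技有限公司 Information processing method and device
CN109857953A (zh) * 2018-11-08 2019-06-07 北京达佳互联信息技术有限公司 Audio/video separation method and device, electronic equipment, and readable storage medium
CN110430243B (zh) * 2019-07-15 2022-09-27 创维集团智能科技有限公司 Advertisement program downloading method and system
CN110955833A (zh) * 2019-11-27 2020-04-03 百度在线网络技术(北京)有限公司 Search method, device, server, terminal equipment, and medium
CN111400579A (zh) * 2020-03-02 2020-07-10 深圳市芯众云科技有限公司 Intelligent hardware search engine system
CN112231578A (zh) * 2020-11-06 2021-01-15 山西三友和智慧信息技术股份有限公司 Advertisement blocking system and method based on graphs and machine learning
CN112632358B (zh) * 2020-12-29 2021-09-14 北京天融信网络安全技术有限公司 Resource link acquisition method and device, electronic equipment, and storage medium
CN113312572A (zh) * 2021-05-17 2021-08-27 深圳市中科明望通信软件有限公司 Resource processing method and device, storage medium, and electronic equipment
CN113626737B (zh) * 2021-10-12 2022-03-11 北京天际友盟信息技术有限公司 Method and device for identifying subject links, electronic equipment, and storage medium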

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
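CN102033881A (zh) * 2009-09-30 2011-04-27 国际商业机器公司 Method and system for identifying advertisements in web pages
CN102270206A (zh) * 2010-06-03 2011-12-07 北京迅捷英翔网络科技有限公司 Method and device for capturing effective web page content
CN103577427A (zh) * 2012-07-25 2014-02-12 中国移动通信集团公司 Browser-kernel-based web page crawling method and device, and browser containing the device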
US20140068411A1 (en) * 2012-08-31 2014-03-06 Scott Ross Methods and apparatus to monitor usage of internet advertising networks

Family Cites Families (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030110277A1 (en) * 2001-12-10 2003-06-12 Sheng-Tzong Cheng Method and platform for using wireless multimedia files
US7861151B2 (en) * 2006-12-05 2010-12-28 Microsoft Corporation Web site structure analysis
US7814084B2 (en) * 2007-03-21 2010-10-12 Schmap Inc. Contact information capture and link redirection
CN101582075B (zh) * 2009-06-24 2011-05-11 大连海事大学 Web information extraction system
US20130204867A1 (en) * 2010-07-30 2013-08-08 Hewlett-Packard Development Company, Lp. Selection of Main Content in Web Pages
US8413046B1 (en) * 2011-10-12 2013-04-02 Google Inc. System and method for optimizing rich web applications
US9443012B2 (en) * 2012-01-31 2016-09-13 Ncr Corporation Method of determining http process information
WO2013167169A1 (en) * 2012-05-08 2013-11-14 Nokia Siemens Networks Oy Method and apparatus
US20140089786A1 (en) * 2012-06-01 2014-03-27 Atiq Hashmi Automated Processor For Web Content To Mobile-Optimized Content Transformation
CN103593354B (zh) * 2012-08-15 2018-09-07 腾讯科技(深圳)有限公司 Method, device, server, and system for filtering web page advertisements
GB2509773A (en) * 2013-01-15 2014-07-16 Ibm Automatic genre determination of web content
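CN103455600B (zh) * 2013-09-03 2017-06-16 小米科技有限责任公司 Video URL capturing method and device, and server equipment
CN103455602B (zh) * 2013-09-03 2017-03-29 小米科技有限责任公司 Video URL capturing method and device, and terminal equipment
CN104636664B (zh) * 2013-11-08 2018-04-27 腾讯科技(深圳)有限公司 Method and device for detecting cross-site scripting vulnerabilities based on the document object model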
US9317694B2 (en) * 2013-12-03 2016-04-19 Microsoft Technology Licensing, Llc Directed execution of dynamic programs in isolated environments
US20150169430A1 (en) * 2013-12-13 2015-06-18 International Business Machines Corporation Selecting webpage test paths
US20150206169A1 (en) * 2014-01-17 2015-07-23 Google Inc. Systems and methods for extracting and generating images for display content
CN105279215A (zh) * 2014-06-10 2016-01-27 中兴通讯股份有限公司 Resource downloading method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3142020A4 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
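CN105653668A (zh) * 2015-12-29 2016-06-08 武汉理工大学 Method for optimizing DOMTree-based web page content analysis and extraction in a cloud environment
WO2018058330A1 (zh) * 2016-09-27 2018-04-05 中兴通讯股份有限公司 Advertisement blocking method, device, browser, and computer storage medium
CN109389419A (zh) * 2018-08-20 2019-02-26 中国平安人寿保险股份有限公司 Advertisement resource preloading method and device, computer equipment, and storage medium
CN109389419B (zh) * 2018-08-20 2023-10-17 中国平安人寿保险股份有限公司 Advertisement resource preloading method and device, computer equipment, and storage medium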
US11797752B1 (en) * 2022-06-21 2023-10-24 Dropbox, Inc. Identifying downloadable objects in markup language

Also Published As

Publication number Publication date
CN105320661A (zh) 2016-02-10
CN105279215A (zh) 2016-01-27
EP3142020A4 (en) 2017-05-24
EP3142020A1 (en) 2017-03-15
US10262341B2 (en) 2019-04-16
US20170132669A1 (en) 2017-05-11

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14894424

Country of ref document: EP

Kind code of ref document: A1

REEP Request for entry into the european phase

Ref document number: 2014894424

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 2014894424

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 15317122

Country of ref document: US

NENP Non-entry into the national phase

Ref country code: DE