CN105740419A - Method and apparatus for acquiring dynamically loaded content in webpage - Google Patents

Method and apparatus for acquiring dynamically loaded content in webpage

Publication number
CN105740419A
Authority
CN
China
Prior art keywords
content
dynamic
address
load
request
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610065885.2A
Other languages
Chinese (zh)
Inventor
周金剑
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Kugou Computer Technology Co Ltd
Original Assignee
Guangzhou Kugou Computer Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Kugou Computer Technology Co Ltd
Priority to CN201610065885.2A
Publication of CN105740419A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G06F16/951 Indexing; Web crawling techniques

Abstract

The present invention relates to a method and an apparatus for acquiring dynamically loaded content in a webpage, and belongs to the technical field of computers. The method comprises: acquiring, from the webpage source code, a specified-type file that loads homepage content in a dynamic content loading area; obtaining, according to the specified-type file, a request address of dynamic content subsequently loaded in the dynamic content loading area; and acquiring, according to the request address, the dynamic content subsequently loaded in the dynamic content loading area. The request address corresponding to the subsequently loaded content can be determined from the content that is loaded first in the dynamic content loading area of a dynamic webpage, and the subsequently loaded content is acquired according to that request address. This solves the problem that current web crawlers can acquire only the content a webpage loads first and cannot acquire the content updated after the dynamic webpage has loaded, and it improves data collection from dynamic webpages.

Description

Method and apparatus for acquiring dynamically loaded content in a webpage
Technical field
The present invention relates to the field of computer technology, and in particular to a method and an apparatus for acquiring dynamically loaded content in a webpage.
Background
With the development of network technology, network data of many types, such as pictures, databases, audio, and video, has appeared in large quantities, and developers of network applications usually need special-purpose tools to collect massive amounts of network data in a targeted way.
In the prior art, developers generally use a web crawler to collect a certain type of data from a large number of webpages. A web crawler is a tool that automatically downloads webpage content; according to a preset crawl target, it selectively visits webpages and related links on the network to obtain the required information.
In the course of implementing the present invention, the applicant found the following problem in the prior art:
Ajax is an asynchronous technique for creating fast, dynamic webpages; it can update the content loaded in a dynamic content loading area of a webpage without reloading the whole page. Many web applications are currently developed with Ajax, but current web crawlers can acquire only the content that a webpage loads first and cannot acquire the content updated after the dynamic webpage has loaded, so data collection from dynamic webpages is poor.
Summary of the invention
Embodiments of the present invention provide a method and an apparatus for acquiring dynamically loaded content in a webpage. The technical solution is as follows:
According to a first aspect of the embodiments of the present invention, a method for acquiring dynamically loaded content in a webpage is provided, the method including:
acquiring, from the webpage source code, a specified-type file that loads homepage content in a dynamic content loading area of the webpage, the homepage content being the content loaded first in the dynamic content loading area;
acquiring, according to the specified-type file, a request address of dynamic content subsequently loaded in the dynamic content loading area; and
acquiring, according to the request address, the dynamic content subsequently loaded in the dynamic content loading area.
Optionally, acquiring the request address of the subsequently loaded dynamic content according to the specified-type file includes:
acquiring an address construction rule of the request address, the address construction rule being a rule determined by parsing the specified-type file; and
constructing the request address according to the specified-type file and the address construction rule.
Optionally, constructing the request address according to the specified-type file and the address construction rule includes:
acquiring a specified content object from the specified-type file;
determining, according to the specified content object, an identifier of the content subsequently loaded in the dynamic content loading area; and
generating the request address according to the identifier of the subsequently loaded content and the address construction rule.
Optionally, when the subsequently loaded dynamic content contains multiple items, generating the request address according to the identifier of the subsequently loaded content and the address construction rule includes:
generating at least one paged content request according to the identifier of the subsequently loaded content and the address construction rule, each paged content request being used to request the content of one subsequently loaded page in the dynamic content loading area.
Optionally, the identifier of the subsequently loaded content is the identifier of the first item subsequently loaded in the dynamic content loading area, and generating the at least one paged content request according to the identifier of the subsequently loaded content and the address construction rule includes:
acquiring the number of items contained in each page of the dynamic content loading area; and
generating the at least one paged content request according to the identifier of the first item subsequently loaded in the dynamic content loading area, the number of items contained in each page, and the address construction rule.
According to a second aspect of the embodiments of the present invention, an apparatus for acquiring dynamically loaded content in a webpage is provided, the apparatus including:
a file acquisition module, configured to acquire, from the webpage source code, a specified-type file that loads homepage content in a dynamic content loading area of the webpage, the homepage content being the content loaded first in the dynamic content loading area;
an address acquisition module, configured to acquire, according to the specified-type file, a request address of dynamic content subsequently loaded in the dynamic content loading area; and
a content acquisition module, configured to acquire, according to the request address, the dynamic content subsequently loaded in the dynamic content loading area.
Optionally, the address acquisition module includes:
a rule unit, configured to acquire an address construction rule of the request address, the address construction rule being a rule determined by parsing the specified-type file; and
an address construction unit, configured to construct the request address according to the specified-type file and the address construction rule.
Optionally, the address construction unit includes:
an object acquisition subunit, configured to acquire a specified content object from the specified-type file;
an identifier acquisition subunit, configured to determine, according to the specified content object, an identifier of the content subsequently loaded in the dynamic content loading area; and
an address generation subunit, configured to generate the request address according to the identifier of the subsequently loaded content and the address construction rule.
Optionally, when the subsequently loaded dynamic content contains multiple items, the address generation subunit is configured to generate at least one paged content request according to the identifier of the subsequently loaded content and the address construction rule, each paged content request being used to request the content of one subsequently loaded page in the dynamic content loading area.
Optionally, the identifier of the subsequently loaded content is the identifier of the first item subsequently loaded in the dynamic content loading area, and the address generation subunit is configured to acquire the number of items contained in each page of the dynamic content loading area, and to generate the at least one paged content request according to the identifier of the first item subsequently loaded in the dynamic content loading area, the number of items contained in each page, and the address construction rule.
The technical solution provided by the embodiments of the present invention can have the following beneficial effects:
The specified-type file that loads the homepage content in the dynamic content loading area is acquired from the webpage source code, the request address of the dynamic content subsequently loaded in the area is acquired according to that file, and the subsequently loaded dynamic content is acquired according to the request address. The request address corresponding to the subsequently loaded content can thus be determined from the content that the dynamic content loading area of a dynamic webpage loads first, and the subsequently loaded content can be acquired according to that address. This solves the problem that current web crawlers can acquire only the content a webpage loads first and cannot acquire the content updated after the dynamic webpage has loaded, and it improves data collection from dynamic webpages.
It should be understood that the above general description and the following detailed description are only exemplary and explanatory, and do not limit the present invention.
Brief description of the drawings
The accompanying drawings are incorporated into and constitute a part of this specification; they show embodiments consistent with the present invention and, together with the description, serve to explain the principles of the invention.
Fig. 1 is a flow chart of a method for acquiring dynamically loaded content in a webpage according to an exemplary embodiment;
Fig. 2 is a flow chart of a method for acquiring dynamically loaded content in a webpage according to another exemplary embodiment;
Fig. 3 is a block diagram of an apparatus for acquiring dynamically loaded content in a webpage according to an exemplary embodiment;
Fig. 4 is a block diagram of an apparatus for acquiring dynamically loaded content in a webpage according to another exemplary embodiment;
Fig. 5 is a block diagram of a device according to an exemplary embodiment;
Fig. 6 is a block diagram of a device according to an exemplary embodiment.
Detailed description
Exemplary embodiments are described in detail here, and examples of them are shown in the accompanying drawings. When the following description refers to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present invention. Rather, they are merely examples of apparatuses and methods consistent with some aspects of the present invention as detailed in the appended claims.
The method provided by the embodiments of the present invention can be applied in an electronic device or a server on which an application program is installed and run. For example, the electronic device includes, but is not limited to, a PC (Personal Computer), a mobile phone, a tablet computer, a laptop computer, or a wearable device; the server may be a single server, a cluster of multiple servers, or a cloud computing center.
Fig. 1 is a flow chart of a method for acquiring dynamically loaded content in a webpage according to an exemplary embodiment. The method may be used in an electronic device or a server and, as shown in Fig. 1, may include the following steps.
In step 101, a specified-type file that loads homepage content in a dynamic content loading area is acquired from the webpage source code.
In step 102, a request address of dynamic content subsequently loaded in the dynamic content loading area is acquired according to the specified-type file.
In step 103, the dynamic content subsequently loaded in the dynamic content loading area is acquired according to the request address.
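Steps 101 to 103 can be sketched roughly as follows. This is only an illustrative outline, not the code of the embodiment: the `fetch` callable, the function names, and the `commentApi = "..."` pattern searched for in the js file are all assumptions made for the example, and a real site would need its own rule.

```python
import re
from typing import Callable, List

# fetch(url) -> response text; injected so the sketch runs without a network.
Fetch = Callable[[str], str]


def find_script_urls(page_source: str) -> List[str]:
    """Step 101: list the js files referenced by the page source."""
    return re.findall(r'<script[^>]+src=["\']([^"\']+\.js)["\']', page_source)


def derive_request_address(script_text: str) -> str:
    """Step 102: read a request address out of the js file.

    The `commentApi = "..."` pattern is hypothetical.
    """
    match = re.search(r'commentApi\s*=\s*["\']([^"\']+)["\']', script_text)
    if match is None:
        raise ValueError("no request address found in script")
    return match.group(1)


def acquire_dynamic_content(page_url: str, fetch: Fetch) -> str:
    """Steps 101 to 103 chained together."""
    page_source = fetch(page_url)
    for script_url in find_script_urls(page_source):
        try:
            request_address = derive_request_address(fetch(script_url))
        except ValueError:
            continue  # this script does not load the dynamic area
        return fetch(request_address)  # step 103
    raise LookupError("no script yielded a request address")
```

Passing `fetch` in as a parameter keeps the sketch independent of any particular HTTP library; in practice it would wrap whatever download routine the crawler already uses.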
In summary, the method for acquiring dynamically loaded content in a webpage provided by this embodiment acquires, from the webpage source code, the specified-type file that loads the homepage content in the dynamic content loading area, acquires from that file the request address of the dynamic content subsequently loaded in the area, and acquires that dynamic content according to the request address. The request address corresponding to the subsequently loaded content can thus be determined from the content that the dynamic content loading area of a dynamic webpage loads first, and the subsequently loaded content can be acquired according to that address. This solves the problem that current web crawlers can acquire only the content a webpage loads first and cannot acquire the content updated after the dynamic webpage has loaded, and it improves data collection from dynamic webpages.
Fig. 2 is a flow chart of a method for acquiring dynamically loaded content in a webpage according to another exemplary embodiment. The method may be used in an electronic device or a server and, as shown in Fig. 2, may include the following steps.
In step 201, a specified-type file that loads homepage content in a dynamic content loading area is acquired from the webpage source code.
The dynamic content loading area is the area of the webpage in which dynamic content is loaded; this dynamic content may be refreshed in response to a user operation or at a preset interval. The homepage content of the dynamic content loading area is the content loaded in that area when the webpage is first loaded. The specified-type file can be obtained directly from the webpage source code, for example from the code in the source that corresponds to the dynamic content loading area; in a Windows environment, the specified-type file is usually a js file.
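One possible way to locate candidate js files in the page source is to walk the HTML with a parser and collect the `src` attribute of each script tag. The sketch below uses Python's standard `html.parser`; the class and function names are illustrative, and deciding which of the collected files actually serves the dynamic content loading area is left to later analysis.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urljoin


class ScriptCollector(HTMLParser):
    """Collects absolute URLs of external js files referenced by a page."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.script_urls: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "script":
            return
        src = dict(attrs).get("src")
        # Ignore inline scripts and non-js sources; tolerate a query string.
        if src and src.split("?")[0].endswith(".js"):
            self.script_urls.append(urljoin(self.base_url, src))


def collect_script_urls(page_source: str, base_url: str) -> List[str]:
    collector = ScriptCollector(base_url)
    collector.feed(page_source)
    return collector.script_urls
```

Resolving each `src` against the page URL with `urljoin` means relative references in the source can be downloaded directly.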
For example, the dynamic content loading area may be the comment area of an item of content in the webpage. When the webpage is first loaded, the comment area may load some of the user comments or none at all; when the user clicks a control in the comment area (for example, a "show more comments" control), the webpage refreshes only the content of the comment area to show more or all of the user comments.
In step 202, an address construction rule of the request address is acquired, the address construction rule being a rule determined by parsing the specified-type file.
The address construction rule may be obtained by parsing the specified-type file with a code analysis script, or it may be set by a developer after analyzing the specified-type file manually.
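As a rough illustration of obtaining a rule with a script, the sketch below looks for a URL string literal with a query string inside the js text and splits it into a base address plus default parameters, which together act as a simple address construction rule. The literal format and the parameter names (`commentid`, `reqnum`) are assumptions for the example; real js files often assemble addresses in ways that need manual analysis.

```python
import re
from typing import Dict, Tuple
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical assumption: the js file contains the request URL as one literal.
URL_LITERAL = re.compile(r'["\'](https?://[^"\']+\?[^"\']*)["\']')

Rule = Tuple[str, Dict[str, str]]


def extract_rule(script_text: str) -> Rule:
    """Return (base address, default query parameters) found in the script."""
    match = URL_LITERAL.search(script_text)
    if match is None:
        raise ValueError("no URL literal with a query string found")
    parts = urlsplit(match.group(1))
    base = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    defaults = dict(parse_qsl(parts.query, keep_blank_values=True))
    return base, defaults


def build_address(rule: Rule, **overrides: str) -> str:
    """Fill the rule's parameters with concrete values to get a request address."""
    base, defaults = rule
    return base + "?" + urlencode({**defaults, **overrides})
```

With a rule extracted once, `build_address(rule, commentid="2278")` would produce a concrete request address for a given identifier.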
In step 203, the request address is constructed according to the specified-type file and the address construction rule.
Specifically, the electronic device or server used to acquire webpage data may acquire a specified content object from the specified-type file, determine from that object the identifier of the content subsequently loaded in the dynamic content loading area, and generate the request address according to that identifier and the address construction rule.
Optionally, when the subsequently loaded dynamic content contains multiple items, generating the request address according to the identifier of the subsequently loaded content and the address construction rule includes: generating at least one paged content request according to the identifier of the subsequently loaded content and the address construction rule, each paged content request being used to request the content of one subsequently loaded page in the dynamic content loading area.
A dynamic content loading area usually loads more than one item subsequently. When there is a lot of subsequently loaded content, the webpage usually loads it in multiple pages, one page at a time. To improve acquisition of the dynamic content, the method of this embodiment can follow the paged loading of the dynamic content loading area and acquire the subsequently loaded content page by page, that is, generate multiple paged content requests that each acquire one page of the subsequently loaded content.
Optionally, the identifier of the subsequently loaded content is the identifier of the first item subsequently loaded in the dynamic content loading area, and generating the at least one paged content request according to the identifier of the subsequently loaded content and the address construction rule includes: acquiring the number of items contained in each page of the dynamic content loading area; and generating the at least one paged content request according to the identifier of the first item subsequently loaded in the dynamic content loading area, the number of items contained in each page, and the address construction rule.
The items loaded in a dynamic content loading area are usually loaded page by page in a certain order. For example, user comments loaded in the area are usually paged from the most recent to the oldest by posting time. Each item has its own identifier (usually a number), and the identifiers of adjacent items are also adjacent. The identifier of the first subsequently loaded item can therefore be obtained from the identifier of the first item in the homepage content and the number of items in the homepage content, or from the identifier of the last item in the homepage content. For example, if the first item in the homepage content is numbered 2298 and the homepage content contains 20 items, the first subsequently loaded item is numbered 2278; alternatively, if the last item in the homepage content is numbered 2279, the first subsequently loaded item is numbered 2278. When the homepage content contains no items, the electronic device or server can instead capture the first subsequently loaded page of the dynamic content loading area (that is, the page following the homepage content) and parse the number of the first subsequently loaded item from it.
After the identifier of the first subsequently loaded item is obtained, at least one paged content request can be generated according to that identifier, the number of items contained in each page, and the address construction rule.
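The numbering example above can be written out as a small calculation. The sketch assumes, as in that example, that item numbers are consecutive and decrease from newest to oldest; the URL template and its `start_id` and `count` parameters are hypothetical stand-ins for whatever the address construction rule yields.

```python
from typing import List


def first_followup_id(first_homepage_id: int, homepage_count: int) -> int:
    """First subsequently loaded item, from the first homepage item and the count."""
    return first_homepage_id - homepage_count


def first_followup_id_from_last(last_homepage_id: int) -> int:
    """First subsequently loaded item, from the last homepage item."""
    return last_homepage_id - 1


def paged_requests(template: str, first_id: int, page_size: int, pages: int) -> List[str]:
    """One request address per page, each starting page_size items further back."""
    return [
        template.format(start_id=first_id - page * page_size, count=page_size)
        for page in range(pages)
    ]
```

For instance, `first_followup_id(2298, 20)` and `first_followup_id_from_last(2279)` both give 2278, and three pages of 20 items would start at 2278, 2258, and 2238.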
In step 204, the dynamic content subsequently loaded in the dynamic content loading area is acquired according to the request address.
The procedure of the solution shown in this embodiment may be as follows:
Open the web application site to be crawled; at this point the webpage source code records the oss information of the webpage content that the site loads first.
Determine the dynamic content loading area whose subsequently loaded content is to be acquired, and acquire the homepage content of that area. Using the browser's Inspect Element view of the webpage, find the loaded specified-type file (the file that contains the dynamic content request body), which is usually a js file. New page content subsequently loaded in the dynamic content loading area is not recorded in the webpage source code.
After the js file is found, parse it programmatically to obtain the required text content, that is, the homepage content described above.
Analyze the address construction rule of the Ajax dynamic content request body in the js file, and construct the request address of the subsequently loaded dynamic content according to that rule. Request rules generally fall into three kinds: values constructed by rule, values taken partly from other files, and manually generated values such as timestamps.
After the address construction rule of the subsequently loaded dynamic content is determined, construct and send paged content requests, then receive and parse the returned information; the content loaded on multiple different pages can be obtained in this way.
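The three kinds of request values mentioned above might combine into one request address as in the sketch below. The parameter names (`comment_id`, `token`, `_`) are purely hypothetical, and the millisecond timestamp is only one common form of a generated value.

```python
import time
from typing import Optional
from urllib.parse import urlencode


def build_request(base: str, comment_id: str, token: str,
                  now: Optional[float] = None) -> str:
    """Combine the three kinds of request values into one request address.

    comment_id: constructed by rule (for example, derived from an item number)
    token:      taken from another file (hypothetical example)
    "_":        generated locally, here a millisecond timestamp
    """
    seconds = time.time() if now is None else now
    params = {"comment_id": comment_id, "token": token, "_": int(seconds * 1000)}
    return base + "?" + urlencode(params)
```

The optional `now` argument exists only so the generated value can be fixed when checking the output.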
Specifically, take as an example using the method to acquire dynamically loaded comment information (comment content, commenter name, and comment time) on a video website. In the webpage source code, the object for a given video is the vid object. Analysis of the webpage source code shows that, to get the dynamically loaded comment content, a dynamic comment url must first be constructed, and that the comment_id in the comment url is generated dynamically. So before the comment url is constructed, comment_id must first be obtained; the address from which comment_id is obtained in turn depends on vid, and analysis of the source code shows that the vid object can be found in the webpage source file. To acquire the comment content, first define a crawler class. In the initialization section of the class, define the start address of the target website, the request body address for dynamically crawling comment content, and the address information of other associated data needed to construct the request body address. Define a parse method that gets the vid object from the webpage source code, splices the sns_url address from it, and passes the spliced address to the parse_id method through callback. The parse_id method parses comment_id from the response of the sns_url address and passes comment_id into comment_url, the comment address, to obtain the comment url; it then passes the comment url to the parse_comment method through callback. The parse_comment method requests the comment content through the comment url it receives and parses the returned content to obtain the comment content to be crawled. The code for acquiring the subsequently loaded user comments is shown as an image in the original publication.
The example code image does not show how paging is implemented, but the implementation is similar to the flow described above: after comment_id is obtained, constructing different comment_id start values together with reqnum, the number of comments requested, yields the comment content of different pages. The basic flow of the method is:
First get the vid object --> splice sns_url --> parse to obtain comment_id --> splice comment_url --> dynamically acquire and parse the content to be crawled. Similarly, when comment content is acquired page by page, requests with different comment_id and reqnum values achieve the paging effect.
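The basic flow above can be sketched in plain Python as follows. This is not the code from the publication: the two URL templates, the JSON response layout, and the field names `author`, `time`, and `content` are hypothetical, and `fetch` is an injected download function so the sketch does not depend on any crawler framework.

```python
import json
import re
from typing import Callable, Dict, List

# Hypothetical address formats; a real site needs its own analysis.
SNS_URL = "https://sns.example.com/video?vid={vid}"
COMMENT_URL = "https://comment.example.com/list?comment_id={comment_id}&reqnum={reqnum}"


def parse_vid(page_source: str) -> str:
    """Get the vid object from the page source (assumed `vid: "..."` form)."""
    match = re.search(r'\bvid\b\s*[:=]\s*["\'](\w+)["\']', page_source)
    if match is None:
        raise ValueError("vid not found in page source")
    return match.group(1)


def parse_comment_id(sns_body: str) -> str:
    """Parse comment_id from the sns_url response (assumed to be JSON)."""
    return str(json.loads(sns_body)["comment_id"])


def parse_comments(comment_body: str) -> List[Dict[str, str]]:
    """Keep commenter name, comment time, and comment content of each item."""
    return [
        {"author": c["author"], "time": c["time"], "content": c["content"]}
        for c in json.loads(comment_body)["comments"]
    ]


def crawl_comments(page_url: str, fetch: Callable[[str], str],
                   reqnum: int = 20) -> List[Dict[str, str]]:
    """vid --> sns_url --> comment_id --> comment_url --> parsed comments."""
    vid = parse_vid(fetch(page_url))
    comment_id = parse_comment_id(fetch(SNS_URL.format(vid=vid)))
    comment_url = COMMENT_URL.format(comment_id=comment_id, reqnum=reqnum)
    return parse_comments(fetch(comment_url))
```

In a callback-based crawler framework, each of the three fetches would typically become a request whose callback is the next parsing function, matching the parse, parse_id, and parse_comment methods described above.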
It should be noted that this embodiment uses the acquisition of dynamically loaded comment information on a video website only as an illustration. In practice, the method provided by the embodiments of the present invention can also be applied to acquire information that other types of websites load dynamically with Ajax, and the embodiments of the present invention place no limit on this.
In summary, the method for acquiring dynamically loaded content in a webpage provided by this embodiment acquires, from the webpage source code, the specified-type file that loads the homepage content in the dynamic content loading area, acquires from that file the request address of the dynamic content subsequently loaded in the area, and acquires that dynamic content according to the request address. The request address corresponding to the subsequently loaded content can thus be determined from the content that the dynamic content loading area of a dynamic webpage loads first, and the subsequently loaded content can be acquired according to that address. This solves the problem that current web crawlers can acquire only the content a webpage loads first and cannot acquire the content updated after the dynamic webpage has loaded, and it improves data collection from dynamic webpages.
Fig. 3 is a block diagram of an apparatus for acquiring dynamically loaded content in a webpage according to an exemplary embodiment. The apparatus can be applied in an electronic device or a server and, as shown in Fig. 3, may include:
a file acquisition module 301, configured to acquire, from the webpage source code, a specified-type file that loads homepage content in a dynamic content loading area of the webpage, the homepage content being the content loaded first in the dynamic content loading area;
an address acquisition module 302, configured to acquire, according to the specified-type file, a request address of dynamic content subsequently loaded in the dynamic content loading area; and
a content acquisition module 303, configured to acquire, according to the request address, the dynamic content subsequently loaded in the dynamic content loading area.
In summary, the apparatus for acquiring dynamically loaded content in a webpage provided by this embodiment acquires, from the webpage source code, the specified-type file that loads the homepage content in the dynamic content loading area, acquires from that file the request address of the dynamic content subsequently loaded in the area, and acquires that dynamic content according to the request address. The request address corresponding to the subsequently loaded content can thus be determined from the content that the dynamic content loading area of a dynamic webpage loads first, and the subsequently loaded content can be acquired according to that address. This solves the problem that current web crawlers can acquire only the content a webpage loads first and cannot acquire the content updated after the dynamic webpage has loaded, and it improves data collection from dynamic webpages.
Fig. 4 is a block diagram of an apparatus for acquiring dynamically loaded content in a webpage according to another exemplary embodiment. The apparatus can be applied in the electronic device or server described above and, as shown in Fig. 4, may include:
a file acquisition module 401, configured to acquire, from the webpage source code, a specified-type file that loads homepage content in a dynamic content loading area of the webpage, the homepage content being the content loaded first in the dynamic content loading area;
an address acquisition module 402, configured to acquire, according to the specified-type file, a request address of dynamic content subsequently loaded in the dynamic content loading area; and
a content acquisition module 403, configured to acquire, according to the request address, the dynamic content subsequently loaded in the dynamic content loading area.
Optionally, the address acquisition module 402 includes:
a rule unit 402a, configured to acquire an address construction rule of the request address, the address construction rule being a rule determined by parsing the specified-type file; and
an address construction unit 402b, configured to construct the request address according to the specified-type file and the address construction rule.
Optionally, the address construction unit 402b includes:
an object acquisition subunit 402b1, configured to acquire a specified content object from the specified-type file;
an identifier acquisition subunit 402b2, configured to determine, according to the specified content object, an identifier of the content subsequently loaded in the dynamic content loading area; and
an address generation subunit 402b3, configured to generate the request address according to the identifier of the subsequently loaded content and the address construction rule.
Optionally, when the subsequently loaded dynamic content contains multiple items, the address generation subunit 402b3 is configured to generate at least one paged content request according to the identifier of the subsequently loaded content and the address construction rule, each paged content request being used to request the content of one subsequently loaded page in the dynamic content loading area.
Optionally, the identifier of the subsequently loaded content is the identifier of the first item subsequently loaded in the dynamic content loading area, and the address generation subunit 402b3 is configured to acquire the number of items contained in each page of the dynamic content loading area, and to generate the at least one paged content request according to the identifier of the first item subsequently loaded in the dynamic content loading area, the number of items contained in each page, and the address construction rule.
In summary, the apparatus for acquiring dynamically loaded content in a webpage provided by this embodiment acquires, from the webpage source code, the specified-type file that loads the homepage content in the dynamic content loading area, acquires from that file the request address of the dynamic content subsequently loaded in the area, and acquires that dynamic content according to the request address. The request address corresponding to the subsequently loaded content can thus be determined from the content that the dynamic content loading area of a dynamic webpage loads first, and the subsequently loaded content can be acquired according to that address. This solves the problem that current web crawlers can acquire only the content a webpage loads first and cannot acquire the content updated after the dynamic webpage has loaded, and it improves data collection from dynamic webpages.
Fig. 5 is a schematic structural diagram of a device 500 according to an embodiment of the present invention. For example, the device 500 may be a PC or a server. Referring to Fig. 5, the device 500 includes a processing component 522, which further includes one or more processors, and memory resources represented by a memory 532 for storing instructions executable by the processing component 522, such as application programs. The application programs stored in the memory 532 may include one or more modules, each corresponding to a set of instructions. In addition, the processing component 522 is configured to execute the instructions so as to perform the above method for acquiring dynamically loaded content in a webpage.
The device 500 may also include a power supply component 526 configured to perform power management of the device 500, a wired or wireless network interface 550 configured to connect the device 500 to a network, and an input/output (I/O) interface 558. The device 500 may operate based on an operating system stored in the memory 532, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™ or the like.
Referring to Fig. 6, a schematic structural diagram of a device 600 according to an embodiment of the present invention is shown. The device may be an electronic device such as a mobile phone, a tablet computer, a laptop computer or a wearable device. Specifically:
The device 600 may include an RF circuit 610, a memory 620 including one or more computer-readable storage media, an input unit 630, a display unit 640, a sensor 650, an audio circuit 660, a WiFi (wireless fidelity) module 670, a processor 680 including one or more processing cores, a power supply 690 and other components. Those skilled in the art will understand that the device structure shown in Fig. 6 does not limit the device, which may include more or fewer components than illustrated, combine certain components, or use a different arrangement of components. Specifically:
The RF circuit 610 may be used to receive and send signals while sending and receiving information or during a call. In particular, after receiving downlink information from a base station, it passes the information to the one or more processors 680 for processing; in addition, it sends uplink data to the base station. Typically, the RF circuit includes, but is not limited to, an antenna, at least one amplifier, a tuner, one or more oscillators, a subscriber identity module (SIM) card, a transceiver, a coupler, an LNA (Low Noise Amplifier), a duplexer and the like. The RF circuit may also communicate with networks and other devices through wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to GSM (Global System for Mobile Communications), GPRS (General Packet Radio Service), CDMA (Code Division Multiple Access), WCDMA (Wideband Code Division Multiple Access), LTE (Long Term Evolution), email, SMS (Short Messaging Service) and the like.
The memory 620 may be used to store software programs and modules. The processor 680 executes various functional applications and data processing by running the software programs and modules stored in the memory 620. The memory 620 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system and the application programs required by at least one function (such as a sound playback function or an image playback function), and the data storage area may store data created through use of the device 600 (such as audio data or a phone book). In addition, the memory 620 may include high-speed random access memory and may also include non-volatile memory, for example at least one magnetic disk storage device, a flash memory device or another volatile solid-state storage device. Correspondingly, the memory 620 may also include a memory controller to provide the processor 680 and the input unit 630 with access to the memory 620.
The input unit 630 may be used to receive input numeric or character information and to generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function control. Specifically, the input unit 630 may include a touch-sensitive surface 631 and other input devices 632. The touch-sensitive surface 631, also called a touch display screen or trackpad, can collect touch operations by the user on or near it (such as operations performed by the user on or near the touch-sensitive surface 631 with a finger, stylus or any other suitable object or accessory) and drive the corresponding connection apparatus according to a preset program. Optionally, the touch-sensitive surface 631 may include two parts: a touch detection apparatus and a touch controller. The touch detection apparatus detects the position of the user's touch, detects the signal produced by the touch operation, and passes the signal to the touch controller; the touch controller receives the touch information from the touch detection apparatus, converts it into contact coordinates, sends them to the processor 680, and can receive and execute commands sent by the processor 680. The touch-sensitive surface 631 may be implemented using resistive, capacitive, infrared, surface acoustic wave and other types. In addition to the touch-sensitive surface 631, the input unit 630 may include other input devices 632. Specifically, the other input devices 632 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys and a power key), a trackball, a mouse, a joystick and the like.
The display unit 640 may be used to display information input by the user or provided to the user, and the various graphical user interfaces of the device 600; these graphical user interfaces may be composed of graphics, text, icons, video and any combination thereof. The display unit 640 may include a display panel 641; optionally, the display panel 641 may be configured in the form of an LCD (Liquid Crystal Display), an OLED (Organic Light-Emitting Diode) display or the like. Further, the touch-sensitive surface 631 may cover the display panel 641. When the touch-sensitive surface 631 detects a touch operation on or near it, it passes the operation to the processor 680 to determine the type of touch event, and the processor 680 then provides the corresponding visual output on the display panel 641 according to the type of touch event. Although in Fig. 6 the touch-sensitive surface 631 and the display panel 641 implement the input and output functions as two separate components, in some embodiments the touch-sensitive surface 631 and the display panel 641 may be integrated to implement the input and output functions.
The device 600 may also include at least one sensor 650, such as a light sensor, a motion sensor and other sensors. Specifically, the light sensor may include an ambient light sensor and a proximity sensor: the ambient light sensor can adjust the brightness of the display panel 641 according to the ambient light level, and the proximity sensor can turn off the display panel 641 and/or the backlight when the device 600 is moved to the ear. As one kind of motion sensor, a gravity acceleration sensor can detect the magnitude of acceleration in each direction (generally three axes), can detect the magnitude and direction of gravity when stationary, and can be used in applications that recognize the attitude of the phone (such as landscape/portrait switching, related games and magnetometer attitude calibration) and in vibration-recognition functions (such as a pedometer or tap detection). Other sensors that may also be configured in the device 600, such as a gyroscope, barometer, hygrometer, thermometer and infrared sensor, are not described here.
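As a rough illustration of the landscape/portrait switching mentioned above, a stationary three-axis reading can be reduced to the magnitude of gravity and a coarse orientation. The axis convention (x across the screen, y along it), the function names and the sample readings are assumptions made for this sketch and are not specified in this application.

```python
import math


def gravity_magnitude(ax, ay, az):
    """Magnitude of the measured acceleration vector, in the units of the inputs."""
    return math.sqrt(ax * ax + ay * ay + az * az)


def screen_orientation(ax, ay):
    """Classify orientation from the in-plane components of gravity.

    Assumed convention: x runs across the screen and y along it, so gravity
    lies mostly along y when the device is held upright.
    """
    return "portrait" if abs(ay) >= abs(ax) else "landscape"
```

A real implementation would typically also filter the signal and apply hysteresis so the display does not flip on small movements.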
The audio circuit 660, a speaker 621 and a microphone 622 can provide an audio interface between the user and the device 600. The audio circuit 660 can convert received audio data into an electrical signal and transmit it to the speaker 621, which converts it into a sound signal for output; conversely, the microphone 622 converts a collected sound signal into an electrical signal, which the audio circuit 660 receives and converts into audio data. After the audio data is output to the processor 680 for processing, it is sent via the RF circuit 610 to another device, or output to the memory 620 for further processing. The audio circuit 660 may also include an earphone jack to provide communication between a peripheral earphone and the device 600.
WiFi is a short-range wireless transmission technology. Through the WiFi module 670, the device 600 can help the user send and receive email, browse webpages, access streaming media and so on, providing the user with wireless broadband Internet access. Although Fig. 6 shows the WiFi module 670, it is understood that the module is not an essential part of the device 600 and may be omitted as needed without changing the essence of the invention.
The processor 680 is the control center of the device 600. It connects the various parts of the whole device through various interfaces and lines, and performs the various functions of the device 600 and processes data by running or executing the software programs and/or modules stored in the memory 620 and calling the data stored in the memory 620, thereby monitoring the device as a whole. Optionally, the processor 680 may include one or more processing cores; optionally, the processor 680 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface, application programs and the like, and the modem processor mainly handles wireless communication. It is understood that the modem processor may also not be integrated into the processor 680.
The device 600 also includes a power supply 690 (such as a battery) that supplies power to the components. Preferably, the power supply may be logically connected to the processor 680 through a power management system, so that charging, discharging and power consumption are managed through the power management system. The power supply 690 may also include any components such as one or more DC or AC power sources, a recharging system, a power failure detection circuit, a power converter or inverter, and a power status indicator.
Although not shown, the device 600 may also include a camera, a Bluetooth module and the like, which are not described here.
The device 600 also includes a memory and one or more programs, where the one or more programs are stored in the memory and configured to be executed by one or more processors, so that the electronic device 600 can perform all or part of the steps, performed by the electronic device, of the method for acquiring dynamically loaded content in a webpage shown in Fig. 1 or Fig. 2 above.
After considering the specification and practicing the invention disclosed herein, those skilled in the art will readily conceive of other embodiments of the present invention. This application is intended to cover any variations, uses or adaptations of the present invention that follow its general principles and include common knowledge or customary technical means in the art not disclosed herein. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of the present invention indicated by the following claims.
It should be understood that the present invention is not limited to the precise structures described above and shown in the drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present invention is limited only by the appended claims.

Claims (10)

1. A method for acquiring dynamically loaded content in a webpage, characterized in that the method comprises:
obtaining, from the webpage source code, a specified-type file that loads first-page content in a dynamic content loading area of the webpage, the first-page content being the content first loaded in the dynamic content loading area;
obtaining, according to the specified-type file, a request address of dynamic content subsequently loaded in the dynamic content loading area; and
obtaining, according to the request address, the dynamic content subsequently loaded in the dynamic content loading area.
2. The method according to claim 1, characterized in that obtaining, according to the specified-type file, the request address of the dynamic content subsequently loaded in the dynamic content loading area comprises:
obtaining an address construction rule of the request address, the address construction rule being a rule determined by parsing the specified-type file; and
constructing the request address according to the specified-type file and the address construction rule.
3. The method according to claim 2, characterized in that constructing the request address according to the specified-type file and the address construction rule comprises:
obtaining a specified content object from the specified-type file;
determining, according to the specified content object, an identifier of the content subsequently loaded in the dynamic loading area; and
generating the request address according to the identifier of the subsequently loaded content and the address construction rule.
4. The method according to claim 3, characterized in that, when the subsequently loaded dynamic content comprises multiple items of content, generating the request address according to the identifier of the subsequently loaded content and the address construction rule comprises:
generating at least one paged content request according to the identifier of the subsequently loaded content and the address construction rule, each paged content request being used to request the content of one subsequently loaded page in the dynamic loading area.
5. The method according to claim 4, characterized in that the identifier of the subsequently loaded content is the identifier of the first item of subsequently loaded content in the dynamic loading area, and generating at least one paged content request according to the identifier of the subsequently loaded content and the address construction rule comprises:
obtaining the number of content items contained in each page of the dynamic loading area; and
generating the at least one paged content request according to the identifier of the first item of subsequently loaded content, the number of content items contained in each page of the dynamic loading area, and the address construction rule.
6. An apparatus for acquiring dynamically loaded content in a webpage, characterized in that the apparatus comprises:
a file acquisition module, configured to obtain, from the webpage source code, a specified-type file that loads first-page content in a dynamic content loading area of the webpage, the first-page content being the content first loaded in the dynamic content loading area;
an address acquisition module, configured to obtain, according to the specified-type file, a request address of dynamic content subsequently loaded in the dynamic content loading area; and
a content acquisition module, configured to obtain, according to the request address, the dynamic content subsequently loaded in the dynamic content loading area.
7. The apparatus according to claim 6, characterized in that the address acquisition module comprises:
a rule acquisition unit, configured to obtain an address construction rule of the request address, the address construction rule being a rule determined by parsing the specified-type file; and
an address construction unit, configured to construct the request address according to the specified-type file and the address construction rule.
8. The apparatus according to claim 7, characterized in that the address construction unit comprises:
an object acquisition subunit, configured to obtain a specified content object from the specified-type file;
an identifier acquisition subunit, configured to determine, according to the specified content object, an identifier of the content subsequently loaded in the dynamic loading area; and
an address generation subunit, configured to generate the request address according to the identifier of the subsequently loaded content and the address construction rule.
9. The apparatus according to claim 8, characterized in that, when the subsequently loaded dynamic content comprises multiple items of content, the address generation subunit is configured to generate at least one paged content request according to the identifier of the subsequently loaded content and the address construction rule, each paged content request being used to request the content of one subsequently loaded page in the dynamic loading area.
10. The apparatus according to claim 9, characterized in that the identifier of the subsequently loaded content is the identifier of the first item of subsequently loaded content in the dynamic loading area, and the address generation subunit is configured to obtain the number of content items contained in each page of the dynamic loading area, and to generate the at least one paged content request according to the identifier of the first item of subsequently loaded content, the number of content items contained in each page of the dynamic loading area, and the address construction rule.
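For readers who want a concrete picture of the address construction recited in claims 2 and 3, one possible reading is sketched below. It is an illustration under stated assumptions, not the claimed implementation: the specified-type file is assumed to be script text that assigns a JSON object to a variable (here the hypothetical `firstPageData`), item identifiers are assumed to be integers, the next identifier is assumed to be one past the last loaded item, and the address construction rule is assumed to be a URL template.

```python
import json
import re


def extract_content_object(file_text, var_name="firstPageData"):
    """Pull the JSON object assigned to `var_name` out of script text.

    `firstPageData` is a hypothetical variable name used for illustration.
    """
    pattern = rf"{re.escape(var_name)}\s*=\s*(\{{.*?\}})\s*;"
    match = re.search(pattern, file_text, re.S)
    if match is None:
        raise ValueError(f"{var_name} not found")
    return json.loads(match.group(1))


def next_content_id(content_object):
    """Identifier of the first subsequently loaded item.

    Assumes integer ids, with the next id one past the largest loaded id.
    """
    return max(item["id"] for item in content_object["items"]) + 1


def build_request_address(rule, content_id):
    """Fill the address construction rule (assumed to be a URL template)."""
    return rule.format(id=content_id)
```

For example, script text containing `var firstPageData = {"items": [{"id": 1}, {"id": 2}]};` together with the template `http://example.com/more?start={id}` would give a request address ending in `start=3`.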
CN201610065885.2A 2016-01-29 2016-01-29 Method and apparatus for acquiring dynamically loaded content in webpage Pending CN105740419A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610065885.2A CN105740419A (en) 2016-01-29 2016-01-29 Method and apparatus for acquiring dynamically loaded content in webpage

Publications (1)

Publication Number Publication Date
CN105740419A true CN105740419A (en) 2016-07-06

Family

ID=56248133

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610065885.2A Pending CN105740419A (en) 2016-01-29 2016-01-29 Method and apparatus for acquiring dynamically loaded content in webpage

Country Status (1)

Country Link
CN (1) CN105740419A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106227763A (en) * 2016-07-13 2016-12-14 珠海市魅族科技有限公司 The method and device that a kind of webpage loads
CN107329976A (en) * 2017-05-26 2017-11-07 深圳市小牛在线互联网信息咨询有限公司 Webpage paging method, device, computer equipment and computer-readable recording medium
CN107977424A (en) * 2017-11-27 2018-05-01 山东浪潮商用系统有限公司 A kind of web page interactive system and method
CN108388607A (en) * 2018-02-06 2018-08-10 北京奇艺世纪科技有限公司 A kind of page display method and device
CN109815083A (en) * 2018-12-21 2019-05-28 瑞庭网络技术(上海)有限公司 A kind of monitoring method of application crashes, device, electronic equipment and medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101515300A (en) * 2009-04-02 2009-08-26 阿里巴巴集团控股有限公司 Method and system for grabbing Ajax webpage content
CN101520796A (en) * 2009-02-16 2009-09-02 深圳市腾讯计算机系统有限公司 Method and system for extracting uniform resource locators from web page content
CN102662966A (en) * 2012-03-08 2012-09-12 中国科学院计算机网络信息中心 Method and system for obtaining subject-oriented dynamic page content
US8819819B1 (en) * 2011-04-11 2014-08-26 Symantec Corporation Method and system for automatically obtaining webpage content in the presence of javascript

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
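"Research and Implementation of JavaScript Dynamic Page Parsing Based on Rhino", Computer Technology and Development *
Qu Wujiang et al.: "ASP.NET Data Paging Based on Ajax Technology", Computer Systems & Applications *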

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106227763A (en) * 2016-07-13 2016-12-14 珠海市魅族科技有限公司 The method and device that a kind of webpage loads
CN107329976A (en) * 2017-05-26 2017-11-07 深圳市小牛在线互联网信息咨询有限公司 Webpage paging method, device, computer equipment and computer-readable recording medium
CN107329976B (en) * 2017-05-26 2020-07-14 深圳市小牛在线互联网信息咨询有限公司 Webpage paging method and device, computer equipment and computer readable storage medium
CN107977424A (en) * 2017-11-27 2018-05-01 山东浪潮商用系统有限公司 A kind of web page interactive system and method
CN108388607A (en) * 2018-02-06 2018-08-10 北京奇艺世纪科技有限公司 A kind of page display method and device
CN109815083A (en) * 2018-12-21 2019-05-28 瑞庭网络技术(上海)有限公司 A kind of monitoring method of application crashes, device, electronic equipment and medium

Similar Documents

Publication Publication Date Title
CN103617165B (en) Picture loading method, device and terminal
CN104618217B (en) Share method, terminal, server and the system of resource
CN107204964B (en) Authority management method, device and system
CN105528297A (en) Method and device for testing web page
CN105740419A (en) Method and apparatus for acquiring dynamically loaded content in webpage
CN104219617A (en) Service acquiring method and device
CN107766358B (en) Page sharing method and related device
CN105740145B (en) The method and device of element in orient control
CN108536594A (en) Page test method, device and storage device
CN104065693A (en) Method, device and system for accessing network data in webpage applications
CN104699501B (en) A kind of method and device for running application program
CN103607377B (en) Information sharing method, Apparatus and system
CN104267882A (en) Page suspension frame display method and device
CN103607431B (en) Mobile terminal resource processing method, device and equipment
CN106293738A (en) The update method of a kind of facial expression image and device
CN104735132A (en) Information inquiry method, servers and terminal
CN105955597A (en) Method and device for displaying information
CN112749074B (en) Test case recommending method and device
CN103944922B (en) Data processing method, terminal, server and system
CN105512150A (en) Method and device for information search
CN104965831A (en) Method, server, terminal and system for correcting website addresses
CN106155888A (en) The detection method of webpage loading performance and device in a kind of Mobile solution
CN104391629A (en) Method for sending message in orientation manner, method for displaying message, server and terminal
CN105631059B (en) Data processing method, data processing device and data processing system
CN105094872B (en) A kind of method and apparatus showing web application

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 510660 Guangzhou City, Guangzhou, Guangdong, Whampoa Avenue, No. 315, self - made 1-17

Applicant after: Guangzhou KuGou Networks Co., Ltd.

Address before: 510000 B1, building, No. 16, rhyme Road, Guangzhou, Guangdong, China 13F

Applicant before: Guangzhou KuGou Networks Co., Ltd.

RJ01 Rejection of invention patent application after publication

Application publication date: 20160706