CN106294848A - Web page parsing and acquisition methods and devices - Google Patents

Web page parsing and acquisition methods and devices

Info

Publication number
CN106294848A
Authority
CN
China
Prior art keywords
domain name
webpage
address
label
parsed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610700420.XA
Other languages
Chinese (zh)
Inventor
徐佳宏
朱吕亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Ipanel TV Inc
Original Assignee
Shenzhen Ipanel TV Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Ipanel TV Inc
Priority to CN201610700420.XA
Publication of CN106294848A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/955 Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]

Abstract

This application discloses web page parsing and acquisition methods and devices. When developing a web page, the developer writes the domain names of the hyperlinks associated with the page into a tag in the page head in advance and assigns that tag a target identifier. When the browser parses a web page to be displayed, it obtains the tag with the target identifier from the page head and thereby obtains each domain name the tag contains. In parallel with page parsing, the browser resolves each domain name and saves the corresponding IP address. This avoids the time cost of resolving a domain name on demand when the user requests the resource behind one of the page's hyperlinks, and so reduces user waiting time.

Description

Web page parsing and acquisition methods and devices
Technical field
This application relates to the technical field of web page processing and, more specifically, to web page parsing and acquisition methods and devices.
Background
When browsing the web, a user usually opens a remote web page by its domain name. For example, for the Sina portal at http://www.sina.com.cn, www.sina.com.cn is the domain name of the Sina website. The user only needs to enter this domain name in the browser's address bar to open and browse the page on the remote server. Data exchange between the browser and the remote server runs over an IP network and therefore requires an IP address. The browser must first resolve the domain name entered by the user into an IP address before it can access the remote server at that address.
A web page contains many hyperlinks that point to other web pages or resources. If a hyperlink points to a web page or resource addressed by domain name, the browser must first resolve that domain name into the server's IP address before it can load data. In existing browsers, when a user opens a web page containing hyperlinks and clicks one of them, the browser responds by resolving the domain name of that hyperlink to obtain an IP address and then downloads data from that address. Because domain name resolution takes a certain amount of time, the response time becomes long and the user waits longer.
Summary of the invention
In view of this, the application provides web page parsing and acquisition methods and devices to solve the problem in the prior art that, when a user triggers a hyperlink in a web page, the browser resolves the domain name on demand, which lengthens the response time and increases user waiting time.
To achieve this, the following scheme is proposed:
A web page parsing method, including:
when parsing an obtained web page to be displayed, obtaining the tag with the target identifier in the head of the web page to be displayed, the tag including the domain names of hyperlinks associated with the web page to be displayed;
pre-resolving each domain name contained in the tag to obtain the IP address corresponding to each domain name;
saving the IP address corresponding to each domain name so that, when a resource corresponding to the domain name of a target hyperlink in the web page to be displayed is requested, the saved IP address corresponding to the domain name of the target hyperlink is queried and the resource is downloaded based on the queried IP address.
Preferably, obtaining the tag with the target identifier in the head of the web page to be displayed when parsing the obtained web page includes:
when parsing the web page to be displayed, obtaining the meta tag in its head whose name value is the target value.
Preferably, after the tag with the target identifier in the head of the web page to be displayed is obtained, the method further includes:
adding each domain name contained in the tag to a domain name pre-resolution queue;
and pre-resolving each domain name contained in the tag to obtain the IP address corresponding to each domain name includes:
calling a background thread to pre-resolve each domain name in the domain name pre-resolution queue and obtain the IP address corresponding to each domain name.
Preferably, the method further includes:
when a saved domain name and its corresponding IP address are judged to have reached the expiry limit, adding the domain name that has reached the expiry limit to the domain name pre-resolution queue.
A web page acquisition method, based on the web page parsing method described above, including:
receiving a trigger instruction for a target hyperlink in a web page;
extracting the domain name from the target hyperlink and querying the stored list of correspondences between domain names and IP addresses to determine the IP address corresponding to the extracted domain name; wherein the list records each domain name contained in the tag with the target identifier obtained from the page head during web page parsing, together with the IP address obtained by pre-resolving each domain name;
accessing the server at the determined IP address corresponding to the extracted domain name and obtaining web page data.
A web page parsing device, including:
a domain name acquisition unit, configured to obtain, when parsing an obtained web page to be displayed, the tag with the target identifier in the head of the web page to be displayed, the tag including the domain names of hyperlinks associated with the web page to be displayed;
a domain name pre-resolution unit, configured to pre-resolve each domain name contained in the tag and obtain the IP address corresponding to each domain name;
a correspondence storage unit, configured to save the IP address corresponding to each domain name so that, when a resource corresponding to the domain name of a target hyperlink in the web page to be displayed is requested, the saved IP address corresponding to the domain name of the target hyperlink is queried and the resource is downloaded based on the queried IP address.
Preferably, the domain name acquisition unit includes:
a meta tag acquisition unit, configured to obtain, when parsing the web page to be displayed, the meta tag in its head whose name value is the target value.
Preferably, the device further includes:
a first queue adding unit, configured to add each domain name contained in the tag to a domain name pre-resolution queue after the tag with the target identifier in the head of the web page to be displayed is obtained;
and the domain name pre-resolution unit includes:
a background pre-resolution unit, configured to call a background thread to pre-resolve each domain name in the domain name pre-resolution queue and obtain the IP address corresponding to each domain name.
Preferably, the device further includes:
a second queue adding unit, configured to add a domain name to the domain name pre-resolution queue when the saved domain name and its corresponding IP address are judged to have reached the expiry limit.
A web page acquisition device, based on the web page parsing device described above, including:
a trigger instruction receiving unit, configured to receive a trigger instruction for a target hyperlink in a web page;
an IP address query unit, configured to extract the domain name from the target hyperlink and query the stored list of correspondences between domain names and IP addresses to determine the IP address corresponding to the extracted domain name; wherein the list records each domain name contained in the tag with the target identifier obtained from the page head during web page parsing, together with the IP address obtained by pre-resolving each domain name;
an IP address access unit, configured to access the server at the determined IP address corresponding to the extracted domain name and obtain web page data.
It can be seen from the technical scheme above that the web page parsing method provided by the embodiments of this application parses the obtained web page to be displayed and obtains the tag with the target identifier in its head, the tag including the domain names of hyperlinks associated with the web page to be displayed; pre-resolves each domain name contained in the tag to obtain the corresponding IP address; and saves the IP address corresponding to each domain name so that, when a resource corresponding to the domain name of a target hyperlink in the web page to be displayed is requested, the saved IP address for that domain name is queried and the resource is downloaded based on it. In other words, the web developer writes the domain names of the hyperlinks associated with the page into a tag in the page head in advance and assigns that tag a target identifier. While parsing the web page to be displayed, the browser obtains the tag with the target identifier from the page head and thus each domain name it contains, resolves each domain name in parallel with page parsing, and saves the resulting IP addresses. This avoids the time cost of resolving a domain name on demand when the user requests the resource behind one of the page's hyperlinks and reduces user waiting time.
Brief description of the drawings
To describe the technical schemes of the embodiments of this application or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely embodiments of this application; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow chart of a web page parsing method disclosed in an embodiment of this application;
Fig. 2 is a flow chart of another web page parsing method disclosed in an embodiment of this application;
Fig. 3 is a flow chart of yet another web page parsing method disclosed in an embodiment of this application;
Fig. 4 is a flow chart of a web page acquisition method disclosed in an embodiment of this application;
Fig. 5 is a schematic structural diagram of a web page parsing device disclosed in an embodiment of this application;
Fig. 6 is a schematic structural diagram of a web page acquisition device disclosed in an embodiment of this application.
Detailed description
Before the scheme of this application is introduced, the technical terms used in the text are explained:
1.1 IP address
An IP address (Internet Protocol address) is a unified address format provided by the IP protocol. It assigns a logical address to every network and every host on the Internet, thereby hiding differences between physical addresses.
1.2 Domain name
A domain name is the name of a computer or group of computers on the Internet, made up of a string of names separated by dots. It identifies the electronic location of the computer during data transmission (sometimes also its geographical location; a geographical domain name refers to a region with administrative autonomy). The purpose of a domain name is to give a group of servers (websites, e-mail, FTP and so on) an address that is easy to remember. An IP address is the numeric identifier used for routing and addressing Internet hosts and is hard for people to remember, which is why the character-based identifier of the domain name was created.
1.3 DNS (Domain Name System)
DNS (Domain Name System) is a distributed database on the Internet that maps domain names and IP addresses to each other. It lets users access the Internet more conveniently without having to remember the numeric IP strings that machines read directly. The process of obtaining, from a host name, the IP address corresponding to that host name is called domain name resolution (or host name resolution). The DNS protocol typically runs over UDP and uses port 53. Among the RFC documents, RFC 2181 clarifies the DNS specification, RFC 2136 describes dynamic updates in DNS, and RFC 2308 describes negative caching of DNS queries.
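As an illustration of domain name resolution as defined above, the following minimal Python sketch asks the operating system's resolver for the addresses of a host name. The function name resolve is a hypothetical helper introduced here for illustration and is not part of the application; the sketch only assumes the standard socket.getaddrinfo call.

```python
import socket


def resolve(host: str) -> list[str]:
    """Return the distinct IP addresses the system resolver reports for host.

    `resolve` is an illustrative helper, not something named in the patent.
    """
    infos = socket.getaddrinfo(host, None)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] holds the textual IP address.
    return sorted({info[4][0] for info in infos})
```

Passing a host name such as www.example.com triggers a real DNS lookup, so the result depends on the network; passing an IP literal returns that literal without any lookup.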
1.4 DNS server
A DNS server stores the domain names of all hosts in a network together with their IP addresses and can convert a domain name into an IP address. A domain name must correspond to an IP address, one IP address can have several domain names, and an IP address does not necessarily have a domain name. The domain name system uses a hierarchical structure similar to a directory tree. A name server is usually the server side in the client/server model and takes two main forms: primary server and forwarding server. The process of mapping a domain name to an IP address is called "domain name resolution".
1.5 Link
In computer programming, linking refers to the process of passing parameters and control commands between the modules of a program and combining them into an executable whole. In the context of web pages, a link, also called a hyperlink, is a connection from a web page to a target; the target can be another web page, a different position in the same page, a picture, an e-mail address, a file or even an application.
1.6 Meta tag
The <meta> element provides meta-information about the page, such as the description and keywords for search engines and the update frequency. The <meta> tag is located in the head of the document and has no content. Its attributes define name/value pairs associated with the document.
The technical schemes in the embodiments of this application are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of this application. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of this application.
The application first introduces the prior art through a concrete example.
When a user clicks the hyperlink http://www.ipanel.cn/index.htm in a web page, the whole processing flow is, in chronological order, roughly as follows:
1. The browser parses the address and obtains the domain name www.ipanel.cn;
2. The browser connects to the DNS server and sends a domain name query request;
3. The DNS server returns the IP address corresponding to the domain name to the browser;
4. The browser uses the IP address to establish a socket connection with the web server;
5. The browser sends an HTTP request whose request line is GET /index.htm HTTP/1.1;
6. The web server receives the request, reads index.htm from the file system and returns its content to the browser;
7. The browser receives the page content of index.htm and starts parsing, rendering, laying out and drawing until the page is displayed;
8. The browser closes the connection;
9. The server closes the connection.
The flow above shows that the browser resolves the domain name of a hyperlink only after the user clicks it: it determines the corresponding IP address by interacting with the DNS server and can only then access the server at that address. Domain name resolution takes a certain amount of time, so the response time grows, the user waits longer, and the user experience suffers.
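Steps 1 and 5 of the flow above can be sketched in Python as follows. This is only an illustration under stated assumptions: on_demand_request is a hypothetical name, and the DNS query and socket connection of steps 2 to 4 are deliberately left out so that the sketch stays free of network access.

```python
from urllib.parse import urlsplit


def on_demand_request(url: str) -> tuple[str, str]:
    """Return (domain name, HTTP request line) for a clicked hyperlink.

    Illustrative helper only; the patent does not name such a function.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""   # step 1: domain name taken from the address
    path = parts.path or "/"      # an empty path is requested as "/"
    return host, f"GET {path} HTTP/1.1"  # step 5: the request line
```

In the prior-art flow, the domain name returned here still has to be resolved after the click, and that delay is what the method described below aims to remove.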
To this end, the application provides a web page parsing method. See Fig. 1, a flow chart of a web page parsing method disclosed in an embodiment of this application.
As shown in Fig. 1, the method includes:
Step S100: parse the obtained web page to be displayed and obtain the tag with the target identifier in its head, the tag including the domain names of hyperlinks associated with the web page to be displayed;
Specifically, when developing a web page, the developer determines the domain names of the hyperlinks associated with the page, for example the domain names of hyperlinks contained in the page itself and/or of hyperlinks contained in its second-level pages. After determining the domain names, the developer writes them into a tag in the page head and sets the target identifier on that tag so that the browser can recognise it.
The number of domain names contained in the tag is not limited; there can be one or more, depending on the web page.
On this basis, when the browser parses the obtained web page to be displayed, it parses the page head first, obtains the tag with the target identifier and can then obtain each domain name contained in the tag.
Step S110: pre-resolve each domain name contained in the tag and obtain the IP address corresponding to each domain name;
Specifically, once the domain names have been obtained by parsing, each of them is pre-resolved to obtain its corresponding IP address.
Pre-resolution of the domain names can run concurrently with parsing of the web page to be displayed; that is, the obtained domain names are pre-resolved into their IP addresses while the page is being parsed. Because the tag with the target identifier is located in the page head and web pages are usually parsed from beginning to end, the tag is parsed early. Each domain name it contains can then be pre-resolved directly, without waiting for the whole page to be parsed.
Optionally, a domain name is pre-resolved by sending it to a DNS server, which looks up the corresponding IP address and returns it to the browser.
Step S120: save the IP address corresponding to each domain name so that, when a resource corresponding to the domain name of a target hyperlink in the web page to be displayed is requested, the saved IP address corresponding to the domain name of the target hyperlink is queried and the resource is downloaded based on the queried IP address.
Specifically, after pre-resolution has obtained the IP address corresponding to a domain name, the domain name and its IP address are saved. When the user later requests the resource corresponding to the domain name of a target hyperlink in the web page to be displayed, the browser can look it up directly in the saved correspondences between domain names and IP addresses, which avoids resolving the domain name on demand and speeds up opening the page.
The web page parsing method provided by the embodiments of this application parses the obtained web page to be displayed and obtains the tag with the target identifier in its head, the tag including the domain names of hyperlinks associated with the web page to be displayed; pre-resolves each domain name contained in the tag to obtain the corresponding IP address; and saves the IP address corresponding to each domain name so that, when a resource corresponding to the domain name of a target hyperlink in the web page to be displayed is requested, the saved IP address for that domain name is queried and the resource is downloaded based on it. In other words, the web developer writes the domain names of the hyperlinks associated with the page into a tag in the page head in advance and assigns that tag a target identifier. While parsing the web page to be displayed, the browser obtains the tag with the target identifier from the page head and thus each domain name it contains, resolves each domain name in parallel with page parsing, and saves the resulting IP addresses. This avoids the time cost of resolving a domain name on demand when the user requests the resource behind one of the page's hyperlinks and reduces user waiting time.
See Fig. 2, a flow chart of another web page parsing method disclosed in an embodiment of this application.
Step S200: parse the web page to be displayed and obtain the meta tag in its head whose name value is the target value, the meta tag including the domain names of hyperlinks associated with the web page to be displayed;
Specifically, when developing a web page, the developer determines the domain names of the hyperlinks associated with the page, for example the domain names of hyperlinks contained in the page itself and/or of hyperlinks contained in its second-level pages. After determining the domain names, the developer writes them into a meta tag in the page head and sets the name value of the meta tag to the target value so that the browser can recognise it.
For example: <meta name="dns" content="www.example.com,www.example2.com"/>
Here the name value of the meta tag is dns. The tag contains two domain names, www.example.com and www.example2.com. Different domain names are separated within the content attribute of the meta tag by a fixed separator, in this example a comma (",").
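The extraction of domain names from such a meta tag might be sketched in Python with the standard html.parser module. The class name DnsMetaParser, the function extract_domains and the default target value "dns" are assumptions made for illustration; the application only requires that the tag sit in the page head and carry an agreed name value.

```python
from html.parser import HTMLParser


class DnsMetaParser(HTMLParser):
    """Collect domain names from <meta name=TARGET content="a,b"> inside <head>."""

    def __init__(self, target_name: str = "dns"):
        super().__init__()
        self.target_name = target_name
        self.domains: list[str] = []
        self._in_head = False

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self._in_head = True
        elif tag == "meta" and self._in_head:
            attributes = dict(attrs)
            if attributes.get("name") == self.target_name:
                content = attributes.get("content") or ""
                # Domain names are separated by a fixed separator, here a comma.
                self.domains.extend(
                    part.strip() for part in content.split(",") if part.strip()
                )

    def handle_endtag(self, tag):
        if tag == "head":
            self._in_head = False


def extract_domains(html: str, target_name: str = "dns") -> list[str]:
    """Return the domain names listed in the target meta tag of the page head."""
    parser = DnsMetaParser(target_name)
    parser.feed(html)
    return parser.domains
```

Because the parser only accepts meta tags seen between the opening and closing head tags, a tag with the same name in the page body is ignored, which mirrors the requirement that the tag be located in the page head.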
Step S210: pre-resolve each domain name contained in the tag and obtain the IP address corresponding to each domain name;
Specifically, once the domain names have been obtained by parsing, each of them is pre-resolved to obtain its corresponding IP address.
Pre-resolution of the domain names can run concurrently with parsing of the web page to be displayed; that is, the obtained domain names are pre-resolved into their IP addresses while the page is being parsed. Because the tag with the target identifier is located in the page head and web pages are usually parsed from beginning to end, the tag is parsed early. Each domain name it contains can then be pre-resolved directly, without waiting for the whole page to be parsed.
Step S220: save the IP address corresponding to each domain name so that, when a resource corresponding to the domain name of a target hyperlink in the web page to be displayed is requested, the saved IP address corresponding to the domain name of the target hyperlink is queried and the resource is downloaded based on the queried IP address.
This embodiment describes a specific way of obtaining the domain names contained in the web page to be displayed: the page is parsed to obtain the meta tag in its head whose name value is the target value, and each domain name contained in that tag is then obtained.
See Fig. 3, a flow chart of yet another web page parsing method disclosed in an embodiment of this application.
As shown in Fig. 3, the method includes:
Step S300: parse the web page to be displayed and obtain the meta tag in its head whose name value is the target value, the meta tag including the domain names of hyperlinks associated with the web page to be displayed;
Specifically, when developing a web page, the developer determines the domain names of the hyperlinks associated with the page, for example the domain names of hyperlinks contained in the page itself and/or of hyperlinks contained in its second-level pages. After determining the domain names, the developer writes them into a meta tag in the page head and sets the name value of the meta tag to the target value so that the browser can recognise it.
Step S310: add each domain name contained in the tag to a domain name pre-resolution queue;
Specifically, a domain name pre-resolution queue can be set up in advance, and each domain name contained in the tag is added to this queue.
Step S320: call a background thread to pre-resolve each domain name in the domain name pre-resolution queue and obtain the IP address corresponding to each domain name;
Specifically, a background thread dedicated to resolving the domain names in the domain name pre-resolution queue can be set up in this step. The thread can work at the same time as the browser parses the web page.
Step S330: save the IP address corresponding to each domain name so that, when a resource corresponding to the domain name of a target hyperlink in the web page to be displayed is requested, the saved IP address corresponding to the domain name of the target hyperlink is queried and the resource is downloaded based on the queried IP address.
This embodiment stores the domain names obtained by parsing in a queue and calls a background thread to resolve the domain names in that queue; the thread's resolution of domain names can proceed in parallel with the browser's parsing of the web page.
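The queue and background thread of steps S310 to S330 could look roughly like the Python sketch below. The names start_pre_resolver, resolver and table are hypothetical; resolver stands for any function that maps a domain name to an IP address (for example a wrapper around a DNS lookup), and table is a plain dictionary standing in for the saved correspondences.

```python
import queue
import threading


def start_pre_resolver(domains, resolver, table):
    """Queue the domains and resolve them on a background thread.

    `resolver` (domain -> IP string) and `table` (dict) are assumptions made
    for this sketch; the returned thread can be joined by the caller.
    """
    pending = queue.Queue()
    for domain in domains:          # step S310: fill the pre-resolution queue
        pending.put(domain)

    def worker():
        while True:
            try:
                domain = pending.get_nowait()
            except queue.Empty:
                return              # queue drained, thread finishes
            try:
                table[domain] = resolver(domain)   # steps S320 and S330
            except OSError:
                pass                # resolution failed: leave the domain unresolved

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread
```

The caller would continue parsing the page on the main thread after start_pre_resolver returns, which is one simple way to let resolution and parsing proceed in parallel.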
Optionally, on the basis of the embodiments above, an expiry limit can also be set for the saved domain names and their corresponding IP addresses. When a saved domain name and its IP address are detected to have reached the expiry limit, the domain name is added to the domain name pre-resolution queue. The background thread then resolves the domain name again, obtains its latest IP address, and records the relationship between this latest IP address and the domain name.
The correspondence between domain names and IP addresses can be stored as follows:
Domain name (key)    IP address (value)
www.example1.com     xxx.xxx.xxx.xxx
www.example2.com     xxx.xxx.xxx.xxx
Table 1
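A key/value store like Table 1, extended with the optional expiry limit, might be sketched in Python as follows. The class name DomainTable, the ttl_seconds parameter and the injectable clock are assumptions for illustration; the application does not specify how the expiry limit is measured.

```python
import time


class DomainTable:
    """Domain name -> IP address store with a per-entry expiry limit."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock           # injectable so the expiry logic can be tested
        self._entries = {}           # domain -> (ip, time the entry was saved)

    def save(self, domain: str, ip: str) -> None:
        self._entries[domain] = (ip, self.clock())

    def lookup(self, domain: str):
        entry = self._entries.get(domain)
        return entry[0] if entry else None

    def expired(self) -> list[str]:
        """Return the domains whose entries have reached the expiry limit."""
        now = self.clock()
        return [
            domain
            for domain, (_, saved_at) in self._entries.items()
            if now - saved_at >= self.ttl
        ]
```

The domains returned by expired() are the ones that would be put back into the domain name pre-resolution queue; saving the fresh result with save() resets their age.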
On the basis of the web page parsing methods of the embodiments above, the embodiments of this application further provide a web page acquisition method, that is, the process by which the browser obtains the corresponding web page after the user triggers a hyperlink in a web page. See Fig. 4, a flow chart of a web page acquisition method disclosed in an embodiment of this application.
As shown in Fig. 4, the method includes:
Step S400: receive a trigger instruction for a target hyperlink in a web page;
Specifically, the web pages a user browses often carry hyperlinks. The user can trigger the target hyperlink to be browsed, for example by clicking it, and the browser receives the user's trigger instruction.
Step S410: extract the domain name from the target hyperlink and query the stored list of correspondences between domain names and IP addresses to determine the IP address corresponding to the extracted domain name;
The list of correspondences between domain names and IP addresses records each domain name contained in the tag with the target identifier obtained from the page head during web page parsing, together with the IP address obtained by pre-resolving each domain name. For how the correspondences between domain names and IP addresses are obtained, refer to the embodiments above; the details are not repeated here.
In this step, the domain name is extracted from the target hyperlink and the stored correspondences are queried to determine the IP address corresponding to the extracted domain name.
Step S420: access the server at the determined IP address corresponding to the extracted domain name and obtain web page data.
Specifically, for the process of obtaining web page data from an IP address, refer to the prior art.
Because the application resolves the domain names of the hyperlinks associated with a web page in advance and has already obtained the corresponding IP addresses, the IP address can be looked up locally as soon as the user triggers a target hyperlink. The domain name resolution step is eliminated, the page loads faster, and the user waits less.
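Steps S400 and S410 can be illustrated with the short Python sketch below. The function ip_for_hyperlink and its parameters are hypothetical. The optional fallback_resolver, which resolves on demand when the domain name is not in the saved table, is an assumption added for completeness; the application itself does not describe what happens on a miss.

```python
from urllib.parse import urlsplit


def ip_for_hyperlink(url, table, fallback_resolver=None):
    """Look up the saved IP address for the domain name of a clicked hyperlink.

    `table` is a dict of pre-resolved domain names; `fallback_resolver` is an
    assumed extra, not part of the described method.
    """
    domain = urlsplit(url).hostname      # extract the domain name from the link
    if domain is None:
        return None
    ip = table.get(domain)               # query the saved correspondences
    if ip is None and fallback_resolver is not None:
        ip = fallback_resolver(domain)   # on-demand resolution on a miss
        table[domain] = ip
    return ip
```

When the domain name was pre-resolved during page parsing, the lookup is a local dictionary access and no DNS round trip happens at click time.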
The web page parsing device provided by the embodiments of this application is described below. The web page parsing device described below and the web page parsing method described above may be cross-referenced.
Details not disclosed for the device may be found in the description of the method embodiments.
See Fig. 5, which is a schematic structural diagram of a web page parsing device disclosed in an embodiment of this application.
As shown in Fig. 5, the device includes:
Domain name acquisition unit 51, configured to, when parsing an obtained web page to be displayed, obtain the tag bearing the target identifier in the head of the web page to be displayed, the tag including the domain names of hyperlinks associated with the web page to be displayed;
Domain name pre-resolution unit 52, configured to pre-resolve each domain name contained in the tag and obtain the IP address corresponding to each domain name;
Correspondence storage unit 53, configured to save the IP address corresponding to each domain name, so that when the resource corresponding to the domain name of a target hyperlink in the web page to be displayed is requested, the saved IP address corresponding to the domain name of the target hyperlink is queried and the resource is downloaded based on the queried IP address.
The web page parsing device provided by the embodiments of this application parses an obtained web page to be displayed and obtains the tag bearing the target identifier in the head of that page, the tag including the domain names of hyperlinks associated with the page; pre-resolves each domain name contained in the tag to obtain the corresponding IP address; and saves the IP address corresponding to each domain name, so that when the resource corresponding to the domain name of a target hyperlink in the page is requested, the saved IP address for that domain name is queried and the resource is downloaded based on the queried IP address. In other words, when developing a web page, the developer writes the domain names of the hyperlinks associated with the page into a tag in the page head in advance and assigns the tag a target identifier. While parsing the web page to be displayed, the browser obtains the tag bearing the target identifier in the page head and thereby obtains each domain name the tag contains. It then resolves each domain name concurrently with the page parsing process and saves the resulting IP addresses. This avoids the time cost of resolving a domain name on demand when the user requests the resource corresponding to the domain name of a hyperlink in the page, and so reduces user waiting time.
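The work of the domain name acquisition unit can be illustrated with a short sketch. The text above does not fix the identifier value or the format in which domain names are listed, so the `name` value `pre-resolve-domains` and the comma-separated `content` attribute used here are hypothetical choices for the example, as are the class and function names.

```python
from html.parser import HTMLParser

# Hypothetical target identifier; the source does not specify a value.
TARGET_NAME = "pre-resolve-domains"


class HeadDomainParser(HTMLParser):
    """Collects domain names from <meta name=TARGET_NAME content="..."> in <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.domains = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "meta" and self.in_head:
            attributes = dict(attrs)
            if attributes.get("name") == TARGET_NAME:
                # Assumed format: comma-separated domain names in "content".
                content = attributes.get("content") or ""
                self.domains += [d.strip() for d in content.split(",") if d.strip()]

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def domains_in_head(html):
    """Return the domain names listed by the target tag in the page head."""
    parser = HeadDomainParser()
    parser.feed(html)
    return parser.domains
```

Only tags inside the page head are considered, so a tag with the same `name` placed in the body is ignored.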
Optionally, the domain name acquisition unit may include:
Meta tag acquisition unit, configured to parse the web page to be displayed and obtain the meta tag in the head of the web page to be displayed whose name value is the target value.
Optionally, the device of this application may further include:
First queue adding unit, configured to, after the tag bearing the target identifier in the head of the web page to be displayed is obtained, add each domain name contained in the tag to a domain name pre-resolution queue.
On this basis, the domain name pre-resolution unit may include:
Background pre-resolution unit, configured to call a background thread to pre-resolve each domain name in the domain name pre-resolution queue and obtain the IP address corresponding to each domain name.
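The queue and background thread described above might look like the following sketch. It assumes a single worker thread, a `None` sentinel to stop it, and an injectable `resolver` (defaulting to `socket.gethostbyname`) so the example can be exercised without network access; none of these details come from the application itself.

```python
import queue
import socket
import threading

table_lock = threading.Lock()  # guards the shared correspondence table


def prefetch_worker(q, table, resolver=socket.gethostbyname):
    """Background thread body: resolve queued domain names and save the results."""
    while True:
        domain = q.get()
        if domain is None:  # sentinel value: stop the worker
            q.task_done()
            break
        try:
            ip = resolver(domain)
            with table_lock:
                table[domain] = ip
        except OSError:
            pass  # resolution failed; the domain name stays unresolved
        finally:
            q.task_done()


def start_prefetch(domains, q, table, resolver=socket.gethostbyname):
    """Add the domain names to the pre-resolution queue and start a worker."""
    for domain in domains:
        q.put(domain)
    worker = threading.Thread(
        target=prefetch_worker, args=(q, table, resolver), daemon=True
    )
    worker.start()
    return worker
```

Because resolution runs on its own thread, page parsing on the calling thread is not blocked while the lookups complete; `q.join()` can be used when the caller does need to wait.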
Optionally, the device of this application may further include:
Second queue adding unit, configured to, when it is determined that a saved domain name and its corresponding IP address have reached their expiry time, add the expired domain name to the domain name pre-resolution queue.
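One way to picture the expiry check is to store a timestamp next to each saved IP address and periodically re-queue entries that are too old. The sketch below makes several assumptions not stated in the text: a fixed expiry period `TTL_SECONDS`, a monotonic clock, and a `deque` standing in for the pre-resolution queue.

```python
import time
from collections import deque

TTL_SECONDS = 300.0  # hypothetical expiry period; the source gives no value


def store(table, domain, ip, now=None):
    """Save a domain name's IP address together with the time it was saved."""
    table[domain] = (ip, time.monotonic() if now is None else now)


def requeue_expired(table, pending, ttl=TTL_SECONDS, now=None):
    """Add every expired domain name back to the pre-resolution queue."""
    now = time.monotonic() if now is None else now
    expired = [d for d, (_, saved_at) in table.items() if now - saved_at >= ttl]
    for domain in expired:
        if domain not in pending:  # avoid queueing the same domain name twice
            pending.append(domain)
    return expired
```

The `now` parameter exists only so the behaviour can be checked deterministically; in normal use it is left unset.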
Further, the web page acquisition device provided by the embodiments of this application is described below. The web page acquisition device described below and the web page acquisition method described above may be cross-referenced.
The web page acquisition device disclosed in this application is based on the web page parsing device of the above embodiments. See Fig. 6, which is a schematic structural diagram of a web page acquisition device disclosed in an embodiment of this application.
As shown in Fig. 6, the device includes:
Trigger instruction receiving unit 61, configured to receive a trigger instruction for a target hyperlink in a web page;
IP address query unit 62, configured to extract the domain name from the target hyperlink, query the stored list of correspondences between domain names and IP addresses, and determine the IP address corresponding to the extracted domain name; wherein the correspondence list records each domain name contained in the tag bearing the target identifier in the page head, obtained when the web page was parsed, together with the IP address obtained by pre-resolving each of those domain names;
IP address access unit 63, configured to, according to the determined IP address corresponding to the extracted domain name, access the server at that IP address and obtain the web page data.
Because this application resolves the domain names of the hyperlinks included in a web page in advance and has already obtained the corresponding IP addresses, the corresponding IP address can be looked up locally when the user triggers the target hyperlink. This eliminates the domain name resolution step at click time, shortens page load time, and reduces user waiting time.
Finally, it should also be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise", and any variants thereof are intended to be non-exclusive, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to that process, method, article, or device. Absent further limitation, an element introduced by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be cross-referenced.
The above description of the disclosed embodiments enables those skilled in the art to implement or use this application. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of this application. Therefore, this application is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A web page parsing method, characterized by comprising:
parsing an obtained web page to be displayed, and obtaining the tag bearing a target identifier in the head of the web page to be displayed, the tag including the domain names of hyperlinks associated with the web page to be displayed;
pre-resolving each domain name contained in the tag to obtain the IP address corresponding to each domain name;
saving the IP address corresponding to each domain name, so that when the resource corresponding to the domain name of a target hyperlink in the web page to be displayed is requested, the saved IP address corresponding to the domain name of the target hyperlink is queried and the resource is downloaded based on the queried IP address.
2. The web page parsing method according to claim 1, characterized in that parsing the obtained web page to be displayed and obtaining the tag bearing the target identifier in the head of the web page to be displayed comprises:
parsing the web page to be displayed, and obtaining the meta tag in the head of the web page to be displayed whose name value is the target value.
3. The web page parsing method according to claim 1 or 2, characterized in that, after obtaining the tag bearing the target identifier in the head of the web page to be displayed, the method further comprises:
adding each domain name contained in the tag to a domain name pre-resolution queue;
and pre-resolving each domain name contained in the tag to obtain the IP address corresponding to each domain name comprises:
calling a background thread to pre-resolve each domain name in the domain name pre-resolution queue and obtain the IP address corresponding to each domain name.
4. The web page parsing method according to claim 3, characterized by further comprising:
when it is determined that a saved domain name and its corresponding IP address have reached their expiry time, adding the expired domain name to the domain name pre-resolution queue.
5. A web page acquisition method, characterized in that it is based on the web page parsing method according to any one of claims 1-4, the web page acquisition method comprising:
receiving a trigger instruction for a target hyperlink in a web page;
extracting the domain name from the target hyperlink, querying the stored list of correspondences between domain names and IP addresses, and determining the IP address corresponding to the extracted domain name; wherein the correspondence list records each domain name contained in the tag bearing the target identifier in the page head, obtained when the web page was parsed, together with the IP address obtained by pre-resolving each of those domain names;
according to the determined IP address corresponding to the extracted domain name, accessing the server at that IP address and obtaining the web page data.
6. A web page parsing device, characterized by comprising:
a domain name acquisition unit, configured to, when parsing an obtained web page to be displayed, obtain the tag bearing a target identifier in the head of the web page to be displayed, the tag including the domain names of hyperlinks associated with the web page to be displayed;
a domain name pre-resolution unit, configured to pre-resolve each domain name contained in the tag and obtain the IP address corresponding to each domain name;
a correspondence storage unit, configured to save the IP address corresponding to each domain name, so that when the resource corresponding to the domain name of a target hyperlink in the web page to be displayed is requested, the saved IP address corresponding to the domain name of the target hyperlink is queried and the resource is downloaded based on the queried IP address.
7. The web page parsing device according to claim 6, characterized in that the domain name acquisition unit comprises:
a meta tag acquisition unit, configured to parse the web page to be displayed and obtain the meta tag in the head of the web page to be displayed whose name value is the target value.
8. The web page parsing device according to claim 6 or 7, characterized by further comprising:
a first queue adding unit, configured to, after the tag bearing the target identifier in the head of the web page to be displayed is obtained, add each domain name contained in the tag to a domain name pre-resolution queue;
wherein the domain name pre-resolution unit comprises:
a background pre-resolution unit, configured to call a background thread to pre-resolve each domain name in the domain name pre-resolution queue and obtain the IP address corresponding to each domain name.
9. The web page parsing device according to claim 8, characterized by further comprising:
a second queue adding unit, configured to, when it is determined that a saved domain name and its corresponding IP address have reached their expiry time, add the expired domain name to the domain name pre-resolution queue.
10. A web page acquisition device, characterized in that it is based on the web page parsing device according to any one of claims 6-9, the web page acquisition device comprising:
a trigger instruction receiving unit, configured to receive a trigger instruction for a target hyperlink in a web page;
an IP address query unit, configured to extract the domain name from the target hyperlink, query the stored list of correspondences between domain names and IP addresses, and determine the IP address corresponding to the extracted domain name; wherein the correspondence list records each domain name contained in the tag bearing the target identifier in the page head, obtained when the web page was parsed, together with the IP address obtained by pre-resolving each of those domain names;
an IP address access unit, configured to, according to the determined IP address corresponding to the extracted domain name, access the server at that IP address and obtain the web page data.
CN201610700420.XA 2016-08-22 2016-08-22 A kind of web analysis, acquisition methods and device Pending CN106294848A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610700420.XA CN106294848A (en) 2016-08-22 2016-08-22 A kind of web analysis, acquisition methods and device

Publications (1)

Publication Number Publication Date
CN106294848A true CN106294848A (en) 2017-01-04

Family

ID=57661888

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610700420.XA Pending CN106294848A (en) 2016-08-22 2016-08-22 A kind of web analysis, acquisition methods and device

Country Status (1)

Country Link
CN (1) CN106294848A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107798061A (en) * 2017-09-18 2018-03-13 维沃移动通信有限公司 A kind of webpage loading method and mobile terminal
CN107835267A (en) * 2017-11-15 2018-03-23 维沃移动通信有限公司 Domain name analytic method and device
CN110913027A (en) * 2018-09-14 2020-03-24 北京微播视界科技有限公司 Domain name resolution method and device
CN108536603B (en) * 2018-04-16 2021-03-02 哈尔滨工业大学 Automatic testing method for Web browser behaviors aiming at new top-level domain name
CN116185497A (en) * 2023-01-06 2023-05-30 格兰菲智能科技有限公司 Command analysis method, device, computer equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6643641B1 (en) * 2000-04-27 2003-11-04 Russell Snyder Web search engine with graphic snapshots
CN102882991A (en) * 2012-09-29 2013-01-16 北京奇虎科技有限公司 Browser and domain name resolution method thereof
CN103685604A (en) * 2013-12-20 2014-03-26 北京奇虎科技有限公司 Domain name pre-resolution method and domain name pre-resolution device
CN104135546A (en) * 2014-07-25 2014-11-05 可牛网络技术(北京)有限公司 Method for loading webpage and terminal

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170104