CN106294848A - Webpage parsing method, webpage acquisition method, and devices - Google Patents
- Publication number
- CN106294848A (application CN201610700420.XA)
- Authority
- CN
- China
- Prior art keywords
- domain name
- webpage
- address
- label
- parsed
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/955—Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
Abstract
This application discloses a webpage parsing method, a webpage acquisition method, and corresponding devices. When developing a webpage, the developer writes the domain names of the hyperlinks associated with the webpage into a tag in the webpage head in advance and assigns a target identifier to that tag. When the browser parses a webpage to be displayed, it obtains the tag carrying the target identifier from the webpage head and thereby obtains each domain name the tag contains. In parallel with the webpage parsing process, it resolves each domain name, obtains the corresponding IP address, and saves it. This avoids the time cost of resolving a domain name on demand when the user requests the resource corresponding to the domain name of a hyperlink in the webpage, and so reduces the user's waiting time.
Description
Technical field
The present application relates to the technical field of webpage processing and, more particularly, to a webpage parsing method, a webpage acquisition method, and devices.
Background
When browsing the web, a user usually opens a remote webpage by means of a domain name. For example, the Sina portal is reached at http://www.sina.com.cn, where www.sina.com.cn is the domain name of the Sina website. The user only needs to enter this domain name in the browser's address bar to open the webpage on the remote server and browse it. Data exchange between the browser and the remote server takes place over an IP network and requires an IP address. The browser therefore first resolves the domain name entered by the user and, once it has been resolved to an IP address, accesses the remote server at that address.
A webpage contains many hyperlinks that point to other webpages or resources. If a hyperlink points to a webpage or resource identified by a domain name, the browser must first resolve that domain name to the server's IP address before it can load any data. In existing browsers, when the user opens a webpage containing a hyperlink and clicks the hyperlink, the browser responds to the user operation by resolving the domain name of the hyperlink to obtain an IP address, and then downloads data from that IP address. Because domain name resolution takes a certain amount of time, this lengthens the response time and increases the user's waiting time.
Summary of the invention
In view of this, the present application provides a webpage parsing method, a webpage acquisition method, and devices, to solve the problem in the prior art that, when the user triggers a hyperlink in a webpage, the browser resolves the domain name on demand, which lengthens the response time and increases the user's waiting time.
To achieve the above object, the following solutions are proposed:
A webpage parsing method, comprising:
when parsing an obtained webpage to be displayed, obtaining a tag carrying a target identifier from the head of the webpage to be displayed, the tag containing the domain names of hyperlinks associated with the webpage to be displayed;
pre-resolving each domain name contained in the tag to obtain the IP address corresponding to each domain name;
saving the IP address corresponding to each domain name, so that when the resource corresponding to the domain name of a target hyperlink in the webpage to be displayed is requested, the saved IP address corresponding to the domain name of the target hyperlink is looked up and the resource is downloaded based on the IP address found.
Preferably, obtaining the tag carrying the target identifier from the head of the webpage to be displayed when parsing the obtained webpage to be displayed comprises:
when parsing the webpage to be displayed, obtaining, from the head of the webpage to be displayed, the meta tag whose name value is a target value.
Preferably, after the tag carrying the target identifier is obtained from the head of the webpage to be displayed, the method further comprises:
adding each domain name contained in the tag to a domain name pre-resolution queue;
and pre-resolving each domain name contained in the tag to obtain the IP address corresponding to each domain name comprises:
calling a background thread to pre-resolve each domain name in the domain name pre-resolution queue and obtain the IP address corresponding to each domain name.
Preferably, the method further comprises:
when it is determined that a saved domain name and its corresponding IP address have reached their expiry time, adding the expired domain name to the domain name pre-resolution queue.
A webpage acquisition method, based on the webpage parsing method described above, the webpage acquisition method comprising:
receiving a trigger instruction for a target hyperlink in a webpage;
extracting the domain name from the target hyperlink and querying a stored correspondence list of domain names and IP addresses to determine the IP address corresponding to the extracted domain name, wherein the correspondence list records each domain name contained in the tag carrying the target identifier that was obtained from the webpage head during webpage parsing, together with the IP address obtained by pre-resolving each domain name;
accessing, according to the determined IP address corresponding to the extracted domain name, the server corresponding to that IP address, and obtaining webpage data.
A webpage parsing device, comprising:
a domain name acquisition unit, configured to obtain, when the obtained webpage to be displayed is parsed, a tag carrying a target identifier from the head of the webpage to be displayed, the tag containing the domain names of hyperlinks associated with the webpage to be displayed;
a domain name pre-resolution unit, configured to pre-resolve each domain name contained in the tag and obtain the IP address corresponding to each domain name;
a correspondence storage unit, configured to save the IP address corresponding to each domain name, so that when the resource corresponding to the domain name of a target hyperlink in the webpage to be displayed is requested, the saved IP address corresponding to the domain name of the target hyperlink is looked up and the resource is downloaded based on the IP address found.
Preferably, the domain name acquisition unit comprises:
a meta tag acquisition unit, configured to obtain, when the webpage to be displayed is parsed, the meta tag in the head of the webpage to be displayed whose name value is a target value.
Preferably, the device further comprises:
a first queue adding unit, configured to add each domain name contained in the tag to a domain name pre-resolution queue after the tag carrying the target identifier is obtained from the head of the webpage to be displayed;
and the domain name pre-resolution unit comprises:
a background pre-resolution unit, configured to call a background thread to pre-resolve each domain name in the domain name pre-resolution queue and obtain the IP address corresponding to each domain name.
Preferably, the device further comprises:
a second queue adding unit, configured to add, when it is determined that a saved domain name and its corresponding IP address have reached their expiry time, the expired domain name to the domain name pre-resolution queue.
A webpage acquisition device, based on the webpage parsing device described above, characterized in that the webpage acquisition device comprises:
a trigger instruction receiving unit, configured to receive a trigger instruction for a target hyperlink in a webpage;
an IP address lookup unit, configured to extract the domain name from the target hyperlink and query a stored correspondence list of domain names and IP addresses to determine the IP address corresponding to the extracted domain name, wherein the correspondence list records each domain name contained in the tag carrying the target identifier that was obtained from the webpage head during webpage parsing, together with the IP address obtained by pre-resolving each domain name;
an IP address access unit, configured to access, according to the determined IP address corresponding to the extracted domain name, the server corresponding to that IP address, and obtain webpage data.
As can be seen from the above technical solutions, the webpage parsing method provided by the embodiments of the present application parses the obtained webpage to be displayed and obtains the tag carrying the target identifier from the head of the webpage to be displayed, the tag containing the domain names of hyperlinks associated with the webpage to be displayed; pre-resolves each domain name contained in the tag to obtain the IP address corresponding to each domain name; and saves the IP address corresponding to each domain name, so that when the resource corresponding to the domain name of a target hyperlink in the webpage to be displayed is requested, the saved IP address corresponding to the domain name of the target hyperlink is looked up and the resource is downloaded based on the IP address found. In other words, the developer writes the domain names of the hyperlinks associated with the webpage into a tag in the webpage head in advance and assigns a target identifier to that tag. While parsing the webpage to be displayed, the browser obtains the tag carrying the target identifier from the webpage head and thereby obtains each domain name the tag contains. In parallel with the webpage parsing process, it resolves each domain name, obtains the corresponding IP address, and saves it. This avoids the time cost of resolving a domain name on demand when the user requests the resource corresponding to the domain name of a hyperlink in the webpage, and reduces the user's waiting time.
Brief description of the drawings
To describe the technical solutions of the embodiments of the present application or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Evidently, the drawings described below are merely embodiments of the present application, and a person of ordinary skill in the art may derive other drawings from the drawings provided without creative effort.
Fig. 1 is a flowchart of a webpage parsing method disclosed in an embodiment of the present application;
Fig. 2 is a flowchart of another webpage parsing method disclosed in an embodiment of the present application;
Fig. 3 is a flowchart of yet another webpage parsing method disclosed in an embodiment of the present application;
Fig. 4 is a flowchart of a webpage acquisition method disclosed in an embodiment of the present application;
Fig. 5 is a schematic structural diagram of a webpage parsing device disclosed in an embodiment of the present application;
Fig. 6 is a schematic structural diagram of a webpage acquisition device disclosed in an embodiment of the present application.
Detailed description
Before the solution of the present application is introduced, the technical terms used in the text are explained first:
1.1 IP address
IP address is short for Internet Protocol Address. An IP address is a unified address format provided by the IP protocol. It assigns a logical address to every network and every host on the Internet, thereby hiding the differences between physical addresses.
1.2 Domain name
A domain name (Domain Name) is the name of a computer or group of computers on the Internet, made up of a string of names separated by dots, and is used to identify the electronic location of the computer during data transmission (sometimes it also refers to a geographic location; a geographic domain name refers to a local area with administrative autonomy). The purpose of a domain name is to provide an easy-to-remember address for a group of servers (website, email, FTP, etc.). An IP address is the numeric identifier used for routing and addressing Internet hosts and is hard for people to remember. The character-based identifier known as the domain name was created for this reason.
1.3 DNS (domain name system)
DNS (Domain Name System) is a distributed database on the Internet that maps domain names and IP addresses to each other. It lets users access the Internet more conveniently, without having to remember the numeric IP strings that machines read directly. The process of obtaining, from a host name, the IP address corresponding to that host name is called domain name resolution (or host name resolution). The DNS protocol typically runs over UDP and uses port 53. Among the RFC documents, RFC 2181 clarifies the DNS specification, RFC 2136 describes dynamic updates of DNS, and RFC 2308 describes negative caching of DNS queries.
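As a rough illustration of the query side of domain name resolution, the sketch below builds a standard DNS query message for an A record, the kind of datagram a resolver could send to port 53 over UDP. The function name `build_dns_query` and the fixed transaction ID are assumptions made for this example, and no packet is actually sent.

```python
import struct


def build_dns_query(hostname: str, txid: int = 0x1234) -> bytes:
    """Build a DNS query message asking for the A record of `hostname`."""
    # Header: ID, flags (only "recursion desired" set), one question,
    # and zero answer / authority / additional records.
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    # QNAME: each dot-separated label is prefixed with its length,
    # and the name is terminated by a zero-length label.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE = 1 (A record), QCLASS = 1 (Internet).
    question = qname + struct.pack("!HH", 1, 1)
    return header + question


query = build_dns_query("www.example.com")
```

The response to such a query carries the IP address that the browser then stores or uses; parsing the response is omitted here.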
1.4 DNS server
A DNS server is a server that stores the domain names of all hosts in a network together with their corresponding IP addresses and is able to convert domain names into IP addresses. A domain name must correspond to an IP address, an IP address may have several domain names, and an IP address does not necessarily have a domain name. The domain name system uses a hierarchical structure similar to a directory tree. A domain name server is usually the server side in the client/server model and mainly takes two forms: primary server and forwarding server. The process of mapping a domain name to an IP address is called "domain name resolution".
1.5 Link
Linking refers to the process of passing parameters and control commands between the modules of a computer program and combining them into an executable whole. A link, also called a hyperlink, refers to a connection from a webpage to a target. The target may be another webpage, a different position in the same webpage, an image, an email address, a file, or even an application.
1.6 Meta tag
The <meta> element provides meta-information about the page, such as descriptions and keywords for search engines and the update frequency. The <meta> tag is located in the head of the document and contains no content. The attributes of the <meta> tag define name/value pairs associated with the document.
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments of the present application. Evidently, the described embodiments are only some, rather than all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without creative effort fall within the scope of protection of the present application.
The prior art is first introduced by way of a concrete example.
When the user clicks a hyperlink in a webpage, for example http://www.ipanel.cn/index.htm, the overall processing flow, in chronological order, is roughly as follows:
1. The browser parses the address and obtains the domain name www.ipanel.cn;
2. The browser connects to the DNS server and sends a domain name query request;
3. The DNS server returns the IP address corresponding to the domain name to the browser;
4. The browser uses the IP address to establish a socket connection with the web server;
5. The browser sends an HTTP request whose request line is GET /index.htm HTTP/1.1;
6. The web server receives the request, reads index.htm from the file system, and returns the content to the browser;
7. The browser receives the page content of index.htm and starts to parse, render, lay out, and draw it, completing the page display;
8. The browser closes the connection;
9. The server closes the connection.
As the above flow shows, the browser resolves the domain name of a hyperlink address only after the user has clicked the hyperlink in the webpage. It determines the IP address corresponding to the domain name by interacting with the DNS server and only then can access the server at that IP address. Domain name resolution clearly takes a certain amount of time, so the response time increases, the user waits longer, and the user experience suffers.
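Steps 1 and 5 of the flow above can be sketched as follows. This is only an illustration, not browser code: `plan_request` is a hypothetical helper name, and the `Host` header is added because HTTP/1.1 requires it, although the flow above does not mention it.

```python
from urllib.parse import urlsplit


def plan_request(url: str) -> tuple[str, str]:
    """Return the domain name to resolve and the HTTP request text to send."""
    parts = urlsplit(url)
    host = parts.hostname        # step 1: the domain name in the address
    path = parts.path or "/"     # an empty path means the site root
    # Step 5: request line followed by the Host header and a blank line.
    request = f"GET {path} HTTP/1.1\r\nHost: {host}\r\n\r\n"
    return host, request


host, request = plan_request("http://www.ipanel.cn/index.htm")
```

Between these two steps the browser still has to wait for the DNS round trip of steps 2 and 3, which is the delay the present application aims to remove.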
To this end, the present application provides a webpage parsing method. Referring to Fig. 1, Fig. 1 is a flowchart of a webpage parsing method disclosed in an embodiment of the present application.
As shown in Fig. 1, the method includes:
Step S100: parse the obtained webpage to be displayed and obtain the tag carrying the target identifier from the head of the webpage to be displayed, the tag containing the domain names of hyperlinks associated with the webpage to be displayed;
Specifically, when developing the webpage, the developer determines the domain names of the hyperlinks associated with the webpage, for example the domain names of hyperlinks contained in the webpage and/or of hyperlinks contained in second-level pages of the webpage. After determining the domain names, the developer writes them into a tag in the webpage head and sets a target identifier for the tag so that the browser can recognize it.
It will be understood that the number of domain names contained in the tag is not limited; there may be one or more, depending on the webpage.
On this basis, when parsing the obtained webpage to be displayed, the browser first parses the webpage head, obtains the tag carrying the target identifier, and thereby obtains each domain name contained in the tag.
Step S110: pre-resolve each domain name contained in the tag to obtain the IP address corresponding to each domain name;
Specifically, after the domain names are obtained by parsing, each domain name is pre-resolved to obtain its corresponding IP address.
It will be understood that domain name pre-resolution may be performed at the same time as the parsing of the webpage to be displayed, that is, the domain names obtained are pre-resolved while the webpage is still being parsed. Because the tag carrying the target identifier is located in the webpage head, and webpages are usually parsed from beginning to end, this tag is parsed early. Each domain name contained in the tag can then be pre-resolved immediately, without waiting for the whole webpage to be parsed.
Optionally, domain name pre-resolution may be performed by sending the domain name to a DNS server, which looks up the IP address corresponding to the domain name and returns it to the browser.
Step S120: save the IP address corresponding to each domain name, so that when the resource corresponding to the domain name of a target hyperlink in the webpage to be displayed is requested, the saved IP address corresponding to the domain name of the target hyperlink is looked up and the resource is downloaded based on the IP address found.
Specifically, after the IP address corresponding to a domain name has been obtained by pre-resolution, the domain name and its corresponding IP address are saved. Because the domain names and their corresponding IP addresses are saved, when the user later requests the resource corresponding to the domain name of a target hyperlink in the webpage to be displayed, the IP address can be looked up directly in the saved correspondence between domain names and IP addresses. This avoids resolving the domain name on demand and thus speeds up the opening of the webpage.
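Steps S110 and S120 can be sketched roughly as below. This is a minimal illustration under stated assumptions rather than the implementation of the application: the names `system_resolver` and `pre_resolve` are hypothetical, the operating system's resolver stands in for the query to the DNS server, and skipping domains that fail to resolve is a choice made only for this example.

```python
import socket
from typing import Callable, Dict, Iterable


def system_resolver(domain: str) -> str:
    # Delegates to the operating system's resolver (a blocking call).
    return socket.gethostbyname(domain)


def pre_resolve(
    domains: Iterable[str],
    resolver: Callable[[str], str] = system_resolver,
) -> Dict[str, str]:
    """Resolve each domain in advance and return a domain -> IP table."""
    table: Dict[str, str] = {}
    for domain in domains:
        try:
            table[domain] = resolver(domain)   # S110: pre-resolution
        except OSError:
            # Resolution failed; the domain is left out of the table
            # and would have to be resolved on demand later.
            continue
    return table                               # S120: saved correspondence


def fake_resolver(domain: str) -> str:
    # Stand-in resolver so the example runs without network access.
    known = {"www.example1.com": "192.0.2.10"}
    if domain not in known:
        raise OSError("cannot resolve " + domain)
    return known[domain]


table = pre_resolve(["www.example1.com", "missing.example"], fake_resolver)
```

Passing the resolver in as a parameter keeps the sketch testable and leaves open how the browser actually talks to the DNS server.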
The webpage parsing method provided by the embodiments of the present application parses the obtained webpage to be displayed and obtains the tag carrying the target identifier from the head of the webpage to be displayed, the tag containing the domain names of hyperlinks associated with the webpage to be displayed; pre-resolves each domain name contained in the tag to obtain the IP address corresponding to each domain name; and saves the IP address corresponding to each domain name, so that when the resource corresponding to the domain name of a target hyperlink in the webpage to be displayed is requested, the saved IP address corresponding to the domain name of the target hyperlink is looked up and the resource is downloaded based on the IP address found. In other words, the developer writes the domain names of the hyperlinks associated with the webpage into a tag in the webpage head in advance and assigns a target identifier to that tag. While parsing the webpage to be displayed, the browser obtains the tag carrying the target identifier from the webpage head and thereby obtains each domain name the tag contains. In parallel with the webpage parsing process, it resolves each domain name, obtains the corresponding IP address, and saves it. This avoids the time cost of resolving a domain name on demand when the user requests the resource corresponding to the domain name of a hyperlink in the webpage, and reduces the user's waiting time.
Referring to Fig. 2, Fig. 2 is a flowchart of another webpage parsing method disclosed in an embodiment of the present application.
Step S200: parse the webpage to be displayed and obtain, from the head of the webpage to be displayed, the meta tag whose name value is the target value, the meta tag containing the domain names of hyperlinks associated with the webpage to be displayed;
Specifically, when developing the webpage, the developer determines the domain names of the hyperlinks associated with the webpage, for example the domain names of hyperlinks contained in the webpage and/or of hyperlinks contained in second-level pages of the webpage. After determining the domain names, the developer writes them into a meta tag in the webpage head and sets the name value of the meta tag to the target value so that the browser can recognize it.
For example: <meta name="dns" content="www.example.com,www.example2.com"/>
Here the name value of the meta tag is dns. The meta tag contains two domain names, www.example.com and www.example2.com. The different domain names are placed in the content attribute of the meta tag and are separated by a fixed separator; in the example above the separator is a comma ",".
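A minimal sketch of how a parser might pick the domain names out of such a meta tag is given below, using Python's standard `html.parser` module. The class name `DnsMetaParser`, the helper `extract_domains`, and the default target value `dns` are assumptions taken from the example above; for brevity the sketch does not check that the tag sits inside the head.

```python
from html.parser import HTMLParser


class DnsMetaParser(HTMLParser):
    """Collects domain names from meta tags whose name equals the target value."""

    def __init__(self, target_name: str = "dns"):
        super().__init__()
        self.target_name = target_name
        self.domains = []

    def handle_starttag(self, tag, attrs):
        # Also invoked for self-closing tags such as <meta ... />.
        if tag != "meta":
            return
        attributes = dict(attrs)
        if attributes.get("name") != self.target_name:
            return
        content = attributes.get("content") or ""
        # Domain names are separated by commas; surrounding spaces are dropped.
        self.domains.extend(d.strip() for d in content.split(",") if d.strip())


def extract_domains(html: str) -> list:
    parser = DnsMetaParser()
    parser.feed(html)
    return parser.domains


page = (
    '<html><head><meta name="keywords" content="a,b"/>'
    '<meta name="dns" content="www.example.com, www.example2.com"/>'
    "</head><body></body></html>"
)
domains = extract_domains(page)
```

Because `feed` can be called repeatedly with successive chunks of the page, the domain names become available as soon as the head has been fed, before the rest of the document arrives.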
Step S210: pre-resolve each domain name contained in the tag to obtain the IP address corresponding to each domain name;
Specifically, after the domain names are obtained by parsing, each domain name is pre-resolved to obtain its corresponding IP address.
It will be understood that domain name pre-resolution may be performed at the same time as the parsing of the webpage to be displayed, that is, the domain names obtained are pre-resolved while the webpage is still being parsed. Because the tag carrying the target identifier is located in the webpage head, and webpages are usually parsed from beginning to end, this tag is parsed early. Each domain name contained in the tag can then be pre-resolved immediately, without waiting for the whole webpage to be parsed.
Step S220: save the IP address corresponding to each domain name, so that when the resource corresponding to the domain name of a target hyperlink in the webpage to be displayed is requested, the saved IP address corresponding to the domain name of the target hyperlink is looked up and the resource is downloaded based on the IP address found.
This embodiment describes a specific way of obtaining the domain names contained in the webpage to be displayed: the webpage is parsed to obtain the meta tag in the webpage head whose name value is the target value, and each domain name contained in that tag is then obtained.
Referring to Fig. 3, Fig. 3 is a flowchart of yet another webpage parsing method disclosed in an embodiment of the present application.
As shown in Fig. 3, the method includes:
Step S300: parse the webpage to be displayed and obtain, from the head of the webpage to be displayed, the meta tag whose name value is the target value, the meta tag containing the domain names of hyperlinks associated with the webpage to be displayed;
Specifically, when developing the webpage, the developer determines the domain names of the hyperlinks associated with the webpage, for example the domain names of hyperlinks contained in the webpage and/or of hyperlinks contained in second-level pages of the webpage. After determining the domain names, the developer writes them into a meta tag in the webpage head and sets the name value of the meta tag to the target value so that the browser can recognize it.
Step S310: add each domain name contained in the tag to a domain name pre-resolution queue;
Specifically, a domain name pre-resolution queue may be set up in advance, and each domain name contained in the tag is added to this queue.
Step S320: call a background thread to pre-resolve each domain name in the domain name pre-resolution queue and obtain the IP address corresponding to each domain name;
Specifically, in this step a thread may be set up in the background that is dedicated to resolving the domain names in the domain name pre-resolution queue. The thread may run at the same time as the browser parses the webpage.
Step S330: save the IP address corresponding to each domain name, so that when the resource corresponding to the domain name of a target hyperlink in the webpage to be displayed is requested, the saved IP address corresponding to the domain name of the target hyperlink is looked up and the resource is downloaded based on the IP address found.
This embodiment describes storing the domain names obtained by parsing in a queue and calling a background thread to resolve the domain names in the queue. The thread's resolution of the domain names and the browser's parsing of the webpage can proceed in parallel.
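The queue-plus-background-thread arrangement of this embodiment could look roughly like the following. It is a sketch under assumptions: `start_pre_resolver` is a hypothetical name, the resolver is passed in as a plain function, and a `None` sentinel is used to stop the worker, none of which is specified by the application.

```python
import queue
import threading


def start_pre_resolver(pending: queue.Queue, table: dict, resolver):
    """Start a background thread that resolves every domain put on `pending`."""

    def worker():
        while True:
            domain = pending.get()
            if domain is None:          # sentinel: no more work, stop the thread
                pending.task_done()
                break
            try:
                table[domain] = resolver(domain)
            except OSError:
                pass                    # failed lookups are simply skipped here
            finally:
                pending.task_done()

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


pending = queue.Queue()
table = {}
thread = start_pre_resolver(pending, table, lambda domain: "192.0.2.1")
pending.put("www.example.com")   # the parser would do this as it meets the tag
pending.put(None)
pending.join()                   # wait until the queue has been drained
thread.join(timeout=2)
```

The parsing code only ever calls `put`, so it never blocks on a DNS lookup; the worker thread absorbs that latency.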
Optionally, on the basis of the above embodiments, an expiry time may also be set for the saved domain names and their corresponding IP addresses. When it is detected that a saved domain name and its corresponding IP address have reached their expiry time, the expired domain name is added to the domain name pre-resolution queue. Once the expired domain name is in the queue, the background thread resolves it again, obtains the latest corresponding IP address, and records the relationship between this latest IP address and the domain name.
The correspondence between domain names and IP addresses may be stored as follows:
Domain name (key) | IP address (value) |
www.example1.com | xxx.xxx.xxx.xxx |
www.example2.com | xxx.xxx.xxx.xxx |
Table 1
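One possible way to combine such a table with an expiry time is sketched below. The layout is an assumption: each value is extended to a pair of IP address and expiry timestamp, `requeue_expired` is a hypothetical helper, and the old entry is kept until the background thread has replaced it.

```python
import time
from collections import deque


def requeue_expired(table: dict, pending: deque, now: float = None) -> list:
    """Append every expired domain in `table` to `pending` and return them.

    `table` maps a domain name to a pair (ip_address, expires_at).
    """
    now = time.monotonic() if now is None else now
    expired = [
        domain
        for domain, (_ip, expires_at) in table.items()
        if expires_at <= now
    ]
    for domain in expired:
        if domain not in pending:   # avoid queuing the same domain twice
            pending.append(domain)
    return expired


table = {
    "www.example1.com": ("192.0.2.10", 10.0),   # expires at t = 10
    "www.example2.com": ("192.0.2.20", 50.0),   # expires at t = 50
}
pending = deque()
expired = requeue_expired(table, pending, now=20.0)
```

How long an entry stays valid is left open by the text; a fixed lifetime or the TTL returned by the DNS server would both fit this structure.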
Based on the webpage parsing methods of the above embodiments, the embodiments of the present application further provide a webpage acquisition method, that is, the process by which the browser obtains the corresponding webpage after the user triggers a hyperlink in a webpage. Referring to Fig. 4, Fig. 4 is a flowchart of a webpage acquisition method disclosed in an embodiment of the present application.
As shown in Fig. 4, the method includes:
Step S400: receive a trigger instruction for a target hyperlink in a webpage;
Specifically, the webpages a user browses often contain hyperlinks. The user can trigger the target hyperlink to be browsed, for example by clicking it, and the browser receives the user's trigger instruction.
Step S410: extract the domain name from the target hyperlink and query the stored correspondence list of domain names and IP addresses to determine the IP address corresponding to the extracted domain name;
The correspondence list of domain names and IP addresses records each domain name contained in the tag carrying the target identifier that was obtained from the webpage head during webpage parsing, together with the IP address obtained by pre-resolving each domain name. For how the correspondence between domain names and IP addresses is obtained, refer to the description of the above embodiments, which is not repeated here.
In this step, the domain name is extracted from the target hyperlink and the stored correspondence is queried to determine the IP address corresponding to the extracted domain name.
Step S420: according to the determined IP address corresponding to the extracted domain name, access the server corresponding to that IP address and obtain webpage data.
Specifically, for the process of obtaining webpage data from an IP address, refer to the prior art.
Because the present application resolves the domain names of the hyperlinks contained in the webpage in advance and has already obtained the corresponding IP addresses, when the user triggers a target hyperlink the corresponding IP address can be looked up locally. The domain name resolution step is eliminated, which shortens the page loading time and reduces the user's waiting time.
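The lookup in step S410 can be sketched as follows. The function name `lookup_ip` is hypothetical, and the fallback to on-demand resolution when the domain is missing from the table is an assumption added for completeness; the text only describes the case where the pre-resolved entry is found.

```python
from urllib.parse import urlsplit


def lookup_ip(href: str, table: dict, fallback_resolver):
    """Return (domain, ip) for a hyperlink, preferring the pre-resolved table."""
    domain = urlsplit(href).hostname
    if domain is None:
        raise ValueError("hyperlink has no domain name: " + href)
    ip = table.get(domain)
    if ip is None:
        # Not pre-resolved: fall back to resolving on demand and remember it.
        ip = fallback_resolver(domain)
        table[domain] = ip
    return domain, ip


calls = []


def counting_resolver(domain: str) -> str:
    # Stand-in for a real DNS lookup that records when it is used.
    calls.append(domain)
    return "192.0.2.99"


table = {"www.example1.com": "192.0.2.10"}
hit = lookup_ip("http://www.example1.com/index.htm", table, counting_resolver)
miss = lookup_ip("http://other.example/page", table, counting_resolver)
```

In the first call no resolver is invoked at all, which is the saving described above; the connection to the returned IP address in step S420 then proceeds as in the prior art.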
The webpage parsing device provided by the embodiments of the present application is described below. The webpage parsing device described below and the webpage parsing method described above may be referred to in correspondence with each other.
For details not disclosed in the device embodiments, refer to the description of the method embodiments.
Referring to Fig. 5, Fig. 5 is a schematic structural diagram of a webpage parsing device disclosed in an embodiment of the present application.
As shown in Fig. 5, the device includes:
a domain name acquisition unit 51, configured to obtain, when the obtained webpage to be displayed is parsed, the tag carrying the target identifier from the head of the webpage to be displayed, the tag containing the domain names of hyperlinks associated with the webpage to be displayed;
a domain name pre-resolution unit 52, configured to pre-resolve each domain name contained in the tag and obtain the IP address corresponding to each domain name;
a correspondence storage unit 53, configured to save the IP address corresponding to each domain name, so that when the resource corresponding to the domain name of a target hyperlink in the webpage to be displayed is requested, the saved IP address corresponding to the domain name of the target hyperlink is looked up and the resource is downloaded based on the IP address found.
The web page parsing device provided by the embodiments of the application parses the acquired web page to be displayed and obtains the tag with the target identifier in the head of that web page, the tag including the domain names of the hyperlinks associated with the web page to be displayed. It pre-resolves each domain name contained in the tag to obtain the corresponding IP address, and stores the IP address corresponding to each domain name, so that when a resource corresponding to the domain name of a target hyperlink in the web page to be displayed is requested, the stored IP address corresponding to that domain name is queried and the resource is downloaded based on the queried IP address. It follows that, in the application, web developers write the domain names of the hyperlinks associated with a web page into a tag in the head of the web page when developing it, and assign a target identifier to that tag. While parsing the web page to be displayed, the browser obtains the tag with the target identifier, and thereby each domain name it contains, and resolves each domain name concurrently with the parsing process, obtaining and storing the corresponding IP addresses. This avoids the time cost of resolving a domain name on demand when the user requests the resource corresponding to the domain name of a hyperlink in the web page, and reduces user waiting time.
Optionally, the domain name acquiring unit may include:
Meta tag acquiring unit, configured to parse the web page to be displayed and obtain the meta tag in the head of the web page to be displayed whose name value is the target value.
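As an illustration of obtaining the meta tag whose name value is the target value, the sketch below uses Python's standard `html.parser`. The target value `pre-resolve-domains` and the comma-separated `content` format are assumptions made for this example; the text above does not fix either.

```python
from html.parser import HTMLParser

TARGET_NAME = "pre-resolve-domains"  # hypothetical target value for the name attribute


class HeadDomainParser(HTMLParser):
    """Collect domains from <meta name=TARGET_NAME content="a,b"> inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.domains = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "meta" and self.in_head:
            attr = dict(attrs)
            if attr.get("name") == TARGET_NAME:
                # Assumed format: domains separated by commas in the content attribute.
                content = attr.get("content") or ""
                self.domains += [d.strip() for d in content.split(",") if d.strip()]

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def extract_domains(html):
    """Return the domains listed in the target meta tag of the page head."""
    parser = HeadDomainParser()
    parser.feed(html)
    return parser.domains
```

Only tags inside the head are considered, so a meta tag with the same name placed in the body is ignored.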
Optionally, the device of the application may further include:
First queue adding unit, configured to add each domain name contained in the tag to a domain name pre-resolution queue after the tag with the target identifier in the head of the web page to be displayed is obtained.
On this basis, the domain name pre-resolution unit may include:
Background pre-resolution unit, configured to call a background thread to pre-resolve each domain name in the domain name pre-resolution queue and obtain the IP address corresponding to each domain name.
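The queue and background thread described above can be sketched as follows. This is a minimal illustration under stated assumptions: the names `pre_resolve_queue`, `dns_table`, and `start_background_resolver` are hypothetical, `socket.gethostbyname` stands in for whatever resolver a browser would use, and a `None` item is used as a stop signal purely for the example.

```python
import queue
import socket
import threading

pre_resolve_queue = queue.Queue()   # domain name pre-resolution queue
dns_table = {}                      # stored correspondence: domain -> IP address
table_lock = threading.Lock()


def enqueue_domains(domains):
    """Add each domain taken from the head tag to the pre-resolution queue."""
    for domain in domains:
        pre_resolve_queue.put(domain)


def pre_resolve_worker(resolver=socket.gethostbyname):
    """Drain the queue, storing the IP of each domain that resolves."""
    while True:
        domain = pre_resolve_queue.get()
        if domain is None:           # stop signal (an assumption of this sketch)
            pre_resolve_queue.task_done()
            break
        try:
            ip = resolver(domain)
        except OSError:
            ip = None                # resolution failed; nothing is stored
        if ip is not None:
            with table_lock:
                dns_table[domain] = ip
        pre_resolve_queue.task_done()


def start_background_resolver(resolver=socket.gethostbyname):
    """Start the background thread so page parsing is not blocked by DNS lookups."""
    worker = threading.Thread(target=pre_resolve_worker, args=(resolver,), daemon=True)
    worker.start()
    return worker
```

Passing the resolver as a parameter keeps the sketch testable without network access.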
Optionally, the device of the application may further include:
Second queue adding unit, configured to add a domain name to the domain name pre-resolution queue when it is determined that the stored domain name and its corresponding IP address have reached the expiry time limit.
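The expiry handling can be illustrated with the sketch below. The fixed validity period `TTL_SECONDS`, the use of a plain list as the pending queue, and the choice to keep the stale entry until it is refreshed are assumptions of this example, not details given in the text.

```python
import time

TTL_SECONDS = 300.0  # hypothetical validity period for a stored entry


def store(table, domain, ip, now=None):
    """Store the IP together with the time at which the entry expires."""
    now = time.monotonic() if now is None else now
    table[domain] = (ip, now + TTL_SECONDS)


def requeue_expired(table, pending, now=None):
    """Add every expired domain to `pending` (once) and return the expired domains."""
    now = time.monotonic() if now is None else now
    expired = [d for d, (_, deadline) in table.items() if now >= deadline]
    for domain in expired:
        if domain not in pending:   # avoid queueing the same domain twice
            pending.append(domain)
    return expired
```

The `now` parameter exists only so the behavior can be checked deterministically; in normal use it is left unset.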
Further, the web page acquisition device provided by the embodiments of the application is described below. The web page acquisition device described below and the web page acquisition method described above may be cross-referenced.
The web page acquisition device disclosed in the application is based on the web page parsing device of the above embodiments. Referring to Fig. 6, Fig. 6 is a schematic structural diagram of a web page acquisition device disclosed in an embodiment of the application.
As shown in Fig. 6, the device includes:
Trigger instruction receiving unit 61, configured to receive a trigger instruction for a target hyperlink in a web page;
IP address query unit 62, configured to extract the domain name from the target hyperlink, query the stored list of correspondences between domain names and IP addresses, and determine the IP address corresponding to the extracted domain name; wherein the list of correspondences between domain names and IP addresses records each domain name contained in the tag with the target identifier in the head of the web page, obtained when the web page was parsed, and the corresponding IP address obtained by pre-resolving each domain name;
IP address access unit 63, configured to access, according to the determined IP address corresponding to the extracted domain name, the server corresponding to that IP address and obtain the web page data.
Because the application resolves the domain names of the hyperlinks contained in a web page in advance and obtains the corresponding IP addresses, the corresponding IP address can be looked up locally when the user triggers a target hyperlink. This eliminates the domain name resolution step at request time, shortens page loading time, and reduces user waiting time.
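The text leaves the access step to the prior art. As one possible illustration, the sketch below connects to an already-known IP address while still sending the original domain in the `Host` header, using Python's standard `http.client`. The function names are hypothetical, and the sketch covers plain HTTP only; HTTPS would additionally need the domain for certificate verification.

```python
import http.client
from urllib.parse import urlsplit


def build_request(url, ip):
    """Return (connect_host, port, path, headers) for fetching `url` via `ip`."""
    parts = urlsplit(url)
    if parts.scheme != "http":
        raise ValueError("this sketch only handles plain http URLs")
    port = parts.port or 80
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    # The Host header keeps the original domain (plus any non-default port),
    # so a server hosting several sites can still pick the right one.
    host_header = parts.hostname if port == 80 else f"{parts.hostname}:{port}"
    return ip, port, path, {"Host": host_header}


def fetch_via_ip(url, ip, timeout=5.0):
    """Download the resource by connecting to the stored IP, skipping DNS."""
    connect_host, port, path, headers = build_request(url, ip)
    conn = http.client.HTTPConnection(connect_host, port, timeout=timeout)
    try:
        conn.request("GET", path, headers=headers)
        return conn.getresponse().read()
    finally:
        conn.close()
```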
Finally, it should also be noted that, in this document, relational terms such as first and second are used merely to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise", and any other variants are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to that process, method, article, or device. Without further limitation, an element defined by the statement "including a ..." does not exclude the existence of other identical elements in the process, method, article, or device that includes the element.
The embodiments in this specification are described progressively. Each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be cross-referenced.
The above description of the disclosed embodiments enables those skilled in the art to implement or use the application. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the application. Therefore, the application is not to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (10)
1. A web page parsing method, characterized by including:
parsing an acquired web page to be displayed, and obtaining the tag with the target identifier in the head of the web page to be displayed, the tag including the domain names of the hyperlinks associated with the web page to be displayed;
pre-resolving each domain name contained in the tag to obtain the IP address corresponding to each domain name;
storing the IP address corresponding to each domain name, so that when a resource corresponding to the domain name of a target hyperlink in the web page to be displayed is requested, the stored IP address corresponding to the domain name of the target hyperlink is queried and the resource is downloaded based on the queried IP address.
2. The web page parsing method according to claim 1, characterized in that parsing the acquired web page to be displayed and obtaining the tag with the target identifier in the head of the web page to be displayed includes:
parsing the web page to be displayed, and obtaining the meta tag in the head of the web page to be displayed whose name value is the target value.
Web analysis method the most according to claim 1 and 2, it is characterised in that at the described webpage to be shown of described acquisition
After the label of the target identification of head, the method also includes:
The each domain name comprised by described label is added to domain name pre-parsed queue;
The described each domain name being comprised described label carries out pre-parsed, obtains the IP address that each domain name is corresponding, including:
Call background thread, each domain name in domain name pre-parsed queue is carried out pre-parsed, obtains each domain name
Corresponding IP address.
4. The web page parsing method according to claim 3, characterized by further including:
when it is determined that a stored domain name and its corresponding IP address have reached the expiry time limit, adding the domain name that has reached the expiry time limit to the domain name pre-resolution queue.
5. A web page acquisition method, characterized in that it is based on the web page parsing method according to any one of claims 1-4, the web page acquisition method including:
receiving a trigger instruction for a target hyperlink in a web page;
extracting the domain name from the target hyperlink, querying the stored list of correspondences between domain names and IP addresses, and determining the IP address corresponding to the extracted domain name; wherein the list of correspondences between domain names and IP addresses records each domain name contained in the tag with the target identifier in the head of the web page, obtained when the web page was parsed, and the corresponding IP address obtained by pre-resolving each domain name;
accessing, according to the determined IP address corresponding to the extracted domain name, the server corresponding to that IP address, and obtaining the web page data.
6. A web page parsing device, characterized by including:
a domain name acquiring unit, configured to obtain, when parsing an acquired web page to be displayed, the tag with the target identifier in the head of the web page to be displayed, the tag including the domain names of the hyperlinks associated with the web page to be displayed;
a domain name pre-resolution unit, configured to pre-resolve each domain name contained in the tag and obtain the IP address corresponding to each domain name;
a correspondence storage unit, configured to store the IP address corresponding to each domain name, so that when a resource corresponding to the domain name of a target hyperlink in the web page to be displayed is requested, the stored IP address corresponding to the domain name of the target hyperlink is queried and the resource is downloaded based on the queried IP address.
7. The web page parsing device according to claim 6, characterized in that the domain name acquiring unit includes:
a meta tag acquiring unit, configured to parse the web page to be displayed and obtain the meta tag in the head of the web page to be displayed whose name value is the target value.
8. The web page parsing device according to claim 6 or 7, characterized by further including:
a first queue adding unit, configured to add each domain name contained in the tag to a domain name pre-resolution queue after the tag with the target identifier in the head of the web page to be displayed is obtained;
wherein the domain name pre-resolution unit includes:
a background pre-resolution unit, configured to call a background thread to pre-resolve each domain name in the domain name pre-resolution queue and obtain the IP address corresponding to each domain name.
9. The web page parsing device according to claim 8, characterized by further including:
a second queue adding unit, configured to add a domain name to the domain name pre-resolution queue when it is determined that the stored domain name and its corresponding IP address have reached the expiry time limit.
10. A web page acquisition device, characterized in that it is based on the web page parsing device according to any one of claims 6-9, the web page acquisition device including:
a trigger instruction receiving unit, configured to receive a trigger instruction for a target hyperlink in a web page;
an IP address query unit, configured to extract the domain name from the target hyperlink, query the stored list of correspondences between domain names and IP addresses, and determine the IP address corresponding to the extracted domain name; wherein the list of correspondences between domain names and IP addresses records each domain name contained in the tag with the target identifier in the head of the web page, obtained when the web page was parsed, and the corresponding IP address obtained by pre-resolving each domain name;
an IP address access unit, configured to access, according to the determined IP address corresponding to the extracted domain name, the server corresponding to that IP address and obtain the web page data.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610700420.XA CN106294848A (en) | 2016-08-22 | 2016-08-22 | A kind of web analysis, acquisition methods and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610700420.XA CN106294848A (en) | 2016-08-22 | 2016-08-22 | A kind of web analysis, acquisition methods and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106294848A true CN106294848A (en) | 2017-01-04 |
Family
ID=57661888
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610700420.XA Pending CN106294848A (en) | 2016-08-22 | 2016-08-22 | A kind of web analysis, acquisition methods and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106294848A (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107798061A (en) * | 2017-09-18 | 2018-03-13 | 维沃移动通信有限公司 | A kind of webpage loading method and mobile terminal |
CN107835267A (en) * | 2017-11-15 | 2018-03-23 | 维沃移动通信有限公司 | Domain name analytic method and device |
CN110913027A (en) * | 2018-09-14 | 2020-03-24 | 北京微播视界科技有限公司 | Domain name resolution method and device |
CN108536603B (en) * | 2018-04-16 | 2021-03-02 | 哈尔滨工业大学 | Automatic testing method for Web browser behaviors aiming at new top-level domain name |
CN116185497A (en) * | 2023-01-06 | 2023-05-30 | 格兰菲智能科技有限公司 | Command analysis method, device, computer equipment and storage medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6643641B1 (en) * | 2000-04-27 | 2003-11-04 | Russell Snyder | Web search engine with graphic snapshots |
CN102882991A (en) * | 2012-09-29 | 2013-01-16 | 北京奇虎科技有限公司 | Browser and domain name resolution method thereof |
CN103685604A (en) * | 2013-12-20 | 2014-03-26 | 北京奇虎科技有限公司 | Domain name pre-resolution method and domain name pre-resolution device |
CN104135546A (en) * | 2014-07-25 | 2014-11-05 | 可牛网络技术(北京)有限公司 | Method for loading webpage and terminal |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | C06 | Publication | |
 | PB01 | Publication | |
 | C10 | Entry into substantive examination | |
 | SE01 | Entry into force of request for substantive examination | |
 | RJ01 | Rejection of invention patent application after publication | Application publication date: 20170104 |