CN108280094A - Application up-line and down-line data statistical method and device - Google Patents

Application up-line and down-line data statistical method and device

Info

Publication number
CN108280094A
Authority
CN
China
Prior art keywords
application
address
inquiry
offline
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710010785.4A
Other languages
Chinese (zh)
Other versions
CN108280094B (en)
Inventor
王洪岭
康明吉
秦娇
路博
王跃
乔亲旺
于慧文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Taier Zhixin Technology Co Ltd
Original Assignee
Guangzhou Taier Zhixin Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Taier Zhixin Technology Co Ltd
Priority to CN201710010785.4A
Publication of CN108280094A
Application granted
Publication of CN108280094B
Expired - Fee Related
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/954 Navigation, e.g. using categorised browsing

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Information Transfer Between Computers (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention belongs to the technical field of application data statistics, and in particular relates to an application online/offline (up-line and down-line) data statistics method and device that can determine whether an application is online and track applications coming online and going offline. The method provided by the invention comprises: accessing the application addresses in an existing address data table using crawler technology; obtaining the query state returned by the server, counting online applications and offline applications in the current time period according to the query state, and deleting offline applications from the address data table. The method and device repeatedly crawl the application addresses in the address data table using crawler technology, and compile statistics on which applications in an application store are online, have come online, or have gone offline within a period of time (for example a given day, week, or month).

Description

Application up-line and down-line data statistical method and device
Technical field
The present invention relates to the technical field of application data statistics, and in particular to an application online/offline data statistics method and device.
Background art
Mobile application monitoring mainly uses crawler technology to crawl detailed information from application stores, such as application details and the download count of each application, and compiles statistics on the applications in the application market in order to provide reliable information for industry support and decision-making. Applications change very frequently: new applications come online every day, a large number of applications go offline, and application versions are continually updated. Existing application statistics methods only accumulate data. They therefore cannot account for applications that have gone offline or been updated, cannot tell how many applications are currently online, and still less how many applications came online or went offline within a given period.
Summary of the invention
To address these defects in the prior art, the application online/offline data statistics method and device provided by the invention repeatedly crawl the application addresses in an address data table using crawler technology, and compile statistics on which applications in an application store are online, have come online, or have gone offline within a period of time.
In a first aspect, the invention provides an application online/offline data statistics method, comprising: accessing the application addresses in an existing address data table using crawler technology; obtaining the query state returned by the server, counting online applications and offline applications in the current time period according to the query state, and deleting offline applications from the address data table.
Preferably, counting online applications and offline applications in the current time period according to the query state comprises: if the query state is access failure, putting the application address whose access failed into a newly created error data table; if the query state is redirect, putting the web page address after the redirect into the newly created error data table; after the address data table has been traversed, traversing the newly created error data table; while traversing an error data table, for query states of access failure or redirect, continuing to create a new error data table to store the application addresses whose access failed or was redirected, until a preset condition is reached; if there are still application addresses whose access fails, considering those applications offline and moving the offline application addresses into an offline data table.
Preferably, the preset condition is that the number of traversals reaches a count threshold or the traversal time reaches a time threshold.
Preferably, the method further comprises: if the query state is redirect and application information can be crawled from the web page address after the redirect, adding the redirect target address to the address data table.
Preferably, the method further comprises: if the query state is success, parsing the response message returned by the server, judging from the message content whether the application version has been updated, and counting application version updates in the current time period.
In a second aspect, the invention provides an application online/offline data statistics device, comprising: a data crawling module, configured to access the application addresses in an existing address data table using crawler technology; and an application statistics module, configured to obtain the query state returned by the server, count online applications and offline applications in the current time period according to the query state, and delete offline applications from the address data table.
Preferably, the application statistics module is specifically configured to: if the query state is access failure, put the application address whose access failed into a newly created error data table; if the query state is redirect, put the web page address after the redirect into the newly created error data table; after the address data table has been traversed, traverse the newly created error data table; while traversing an error data table, for query states of access failure or redirect, continue to create a new error data table to store the application addresses whose access failed or was redirected, until a preset condition is reached; if there are still application addresses whose access fails, consider those applications offline and move the offline application addresses into an offline data table.
Preferably, the preset condition is that the number of traversals reaches a count threshold or the traversal time reaches a time threshold.
Preferably, the application statistics module is further configured to: if the query state is redirect and application information can be crawled from the web page address after the redirect, add the redirect target address to the address data table.
Preferably, the application statistics module is further configured to: if the query state is success, parse the response message returned by the server, judge from the message content whether the application version has been updated, and count application version updates in the current time period.
Brief description of the drawings
Fig. 1 is a flowchart of the application online/offline data statistics method provided by an embodiment of the present invention;
Fig. 2 is a structural block diagram of the application online/offline data statistics device provided by an embodiment of the present invention.
Detailed description
Embodiments of the technical solution of the present invention are described in detail below with reference to the drawings. The following embodiments are only used to illustrate the technical solution more clearly; they are examples only and cannot be used to limit the scope of protection of the invention.
It should be noted that, unless otherwise stated, technical or scientific terms used in this application have the ordinary meaning understood by those of ordinary skill in the art to which the invention belongs.
As shown in Fig. 1, this embodiment provides an application online/offline data statistics method, comprising:
Step S1: access the application addresses in the existing address data table using crawler technology.
The address data table stores the application addresses of online applications. An application address is the address of the web page where the application is located, that is, the application's URL.
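As an illustration of the address data table described above, the following is a minimal sketch that stores one URL per application in SQLite. The table name `address_table`, the column name `url`, the helper functions, and the choice of SQLite are all assumptions made for this example; the patent does not specify a storage format.

```python
import sqlite3


def create_address_table(conn):
    """Create the address data table; one row per application URL (assumed schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS address_table (url TEXT PRIMARY KEY)"
    )


def add_address(conn, url):
    """Store an application URL, ignoring duplicates."""
    conn.execute("INSERT OR IGNORE INTO address_table (url) VALUES (?)", (url,))


def list_addresses(conn):
    """Return every stored application URL in a stable order."""
    rows = conn.execute("SELECT url FROM address_table ORDER BY url")
    return [row[0] for row in rows]
```

A crawler for step S1 would then iterate over `list_addresses(conn)` and request each URL in turn.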
Step S2: obtain the query state returned by the server, count online applications and offline applications in the current time period according to the query state, and delete offline applications from the address data table.
The query state is derived from the return code returned by the server. Table 1 lists some of the return codes; this embodiment groups return codes by meaning. Return code "200" means the query state is success, indicating that the application is online. Return codes "302", "303", and so on mean the query state is redirect, indicating that the application may still be online but its address has changed. Return codes "400", "401", and so on mean the query state is access failure. Return code "304" means no change and is not processed. Other return codes may be caused by network problems such as network timeout, server timeout, or packet loss, or may mean that the application is offline; in these cases the query state is access failure.
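The grouping of return codes into query states can be sketched as a small lookup function. This is an illustrative sketch only: the names `QueryStatus` and `classify_return_code` are hypothetical, the use of `None` for "no response at all" is an assumption, and the separate handling of 408 follows the description of step S21 rather than this paragraph.

```python
from enum import Enum


class QueryStatus(Enum):
    SUCCESS = "success"      # 200: the application is online
    REDIRECT = "redirect"    # 3xx: the application may be online at a new address
    UNCHANGED = "unchanged"  # 304: no change, not processed
    RETRY_NOW = "retry_now"  # 408: server timed out waiting; re-crawl immediately
    FAILED = "failed"        # access failure: network problem or offline application


def classify_return_code(code):
    """Map an HTTP return code (or None when no response arrived) to a query state."""
    if code == 200:
        return QueryStatus.SUCCESS
    if code == 304:
        return QueryStatus.UNCHANGED
    if code == 408:
        return QueryStatus.RETRY_NOW
    if code is not None and 300 <= code < 400:
        return QueryStatus.REDIRECT
    # 400, 401, 404, 410, 5xx and any other code are treated as access failure.
    return QueryStatus.FAILED
```

Checking 304 before the general 3xx range matters, because the text treats 304 as "no change" rather than as a redirect.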
Table 1
The application online/offline data statistics method provided by this embodiment repeatedly crawls the application addresses in the address data table using crawler technology, and compiles statistics on which applications in an application store are online, have come online, or have gone offline within a period of time (for example a given day, week, or month).
When application data is re-crawled, problems such as network timeout, congestion, and failure can all make it impossible to crawl the application information. The usual remedy for erroneous data during crawling is to re-crawl the failed addresses right away. However, network problems are hard to resolve in a short time, and because the interval between repeated crawls is very short, the re-crawled data is still erroneous. This approach lowers crawling efficiency and may also aggravate network and server congestion.
To improve the efficiency of crawling application information and the accuracy of the statistics, a preferred implementation of step S2 comprises:
Step S21: if the query state is access failure, put the application address whose access failed into a newly created error data table; if the query state is redirect, put the web page address after the redirect into the newly created error data table.
For example, if the return code returned by the server is 400, 401, 404, 410, or 5## (see Table 1), the corresponding query state is access failure, and the application address is put into the newly created error data table so that it can be crawled again later, which avoids missing data caused by network problems. If the return code is 3## (see Table 1), the web page address after the redirect (the application's new URL) is put into the newly created error data table so that data can be crawled for the application whose address has changed. If the return code is 408, which indicates that the server timed out waiting for the request, the address is re-crawled immediately.
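Step S21 can be sketched as a single pass over the address data table that produces a new error data table. This is a minimal sketch under stated assumptions: `fetch` is a hypothetical callable supplied by the caller that returns a `(return_code, redirect_target)` pair, the tables are plain Python lists, and a 408 response is retried exactly once (the text only says it is re-crawled immediately).

```python
def build_error_table(addresses, fetch):
    """Visit each address once and return a newly created error data table.

    `fetch(url)` is an assumed helper returning (return_code, redirect_target).
    """
    error_table = []
    for url in addresses:
        code, target = fetch(url)
        if code == 408:
            # Server timed out waiting for the request: re-crawl immediately (once).
            code, target = fetch(url)
        if code in (200, 304):
            continue  # success or no change: nothing goes into the error table
        if code is not None and 300 <= code < 400 and target:
            error_table.append(target)  # store the address after the redirect
        else:
            error_table.append(url)     # access failure: crawl again in a later pass
    return error_table
```

Deferring failed addresses to a separate table, instead of retrying them on the spot, is what spaces the retries out in time.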
Step S22: after the address data table has been traversed, traverse the newly created error data table. While traversing an error data table, for query states of access failure or redirect, continue to create a new error data table to store the application addresses whose access failed or was redirected. Repeat creating new error data tables until the preset condition is reached; if there are still application addresses whose access fails, the applications are considered offline and the offline application addresses are moved into the offline data table.
The preset condition is that the number of traversals reaches a count threshold or the traversal time reaches a time threshold. The preset condition exists to prevent the traversal from never terminating.
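The repeated traversal of step S22, bounded by a count threshold or a time threshold, can be sketched as follows. The function name `find_offline`, the `fetch` callable returning `(return_code, redirect_target)`, and the default threshold values are assumptions for illustration. As a simplification, every address left in the last error table when a threshold is reached is treated as offline, including redirect targets that were never revisited.

```python
import time


def find_offline(error_table, fetch, max_passes=3, max_seconds=60.0):
    """Re-crawl error tables until a threshold is hit; return (recovered, offline)."""
    recovered = []
    deadline = time.monotonic() + max_seconds
    passes = 0
    while error_table and passes < max_passes and time.monotonic() < deadline:
        next_table = []  # a new error data table for this traversal
        for url in error_table:
            code, target = fetch(url)
            if code in (200, 304):
                recovered.append(url)       # reachable again: not offline
            elif code is not None and 300 <= code < 400 and target:
                next_table.append(target)   # follow the redirect in the next pass
            else:
                next_table.append(url)      # still failing: try again next pass
        error_table = next_table
        passes += 1
    # Addresses that still fail once the preset condition is reached count as offline.
    return recovered, list(error_table)
```

Because each pass only starts after the previous table has been fully traversed, a failing address is not retried back to back, which is the efficiency argument the description makes.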
Step S23: delete the offline applications from the address data table, obtain the offline status of applications in the current time period from the offline data table, and obtain the online status of applications in the current time period from the address data table.
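Step S23 amounts to removing the offline addresses from the address data table and counting what remains. A minimal sketch, assuming both tables are Python lists of URLs and using the hypothetical name `finalize_period`:

```python
def finalize_period(address_table, offline_table):
    """Remove offline addresses from the address table and summarize the period."""
    offline = set(offline_table)
    remaining = [url for url in address_table if url not in offline]
    stats = {
        "online": len(remaining),  # applications still in the address data table
        "offline": len(offline),   # applications recorded in the offline data table
    }
    return remaining, stats
```

For example, `finalize_period(["a", "b", "c"], ["b"])` yields the remaining table `["a", "c"]` with one application counted as offline.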
In the preferred implementation of step S2, after an error data table has been traversed, a new error data table is always created to hold the addresses whose access failed, and the traversal is then performed on the new error data table. On the one hand, this avoids statistical errors caused by application addresses that could not be crawled because of network or server congestion. On the other hand, it avoids crawling the same address repeatedly within a short time and getting erroneous data every time, which helps improve crawling efficiency. When an application address changes, the server returns a redirect page and the query state is redirect. If application information can be crawled from the web page address after the redirect, the redirect target address is added to the address data table to update the application's address, which makes later statistics easier.
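The address update after a redirect can be sketched as below. The helper `can_crawl_info` is a hypothetical callable that reports whether application information can be crawled from a URL. Dropping the stale address is an assumption made here; the text only says the redirect target is added to the address data table.

```python
def apply_redirect(address_table, old_url, new_url, can_crawl_info):
    """Record new_url in the address table when application info is crawlable there."""
    if not can_crawl_info(new_url):
        return False  # target is not a usable application page: leave the table alone
    if new_url not in address_table:
        address_table.append(new_url)
    if old_url in address_table:
        address_table.remove(old_url)  # assumption: the stale address is dropped
    return True
```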
Application versions are updated frequently. To count application version updates, when the query state is success, the response message returned by the server is parsed, whether the application version has been updated is judged from the message content, and the version updates of applications in the current time period are counted.
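Version-update detection can be sketched as comparing a version string parsed from the response body with the last version seen for that URL. The patent does not say how the version appears in the message, so the pattern below (the word "version" followed by a dotted number) and the names `VERSION_PATTERN` and `detect_version_update` are assumptions for illustration only.

```python
import re

# Assumed page format: the word "version", an optional colon, then a dotted number.
VERSION_PATTERN = re.compile(r"version\s*[:：]?\s*v?(\d+(?:\.\d+)*)", re.IGNORECASE)


def detect_version_update(url, body, known_versions):
    """Return True when the version found in `body` differs from the stored one."""
    match = VERSION_PATTERN.search(body)
    if match is None:
        return False  # no version found in the response: nothing to record
    version = match.group(1)
    previous = known_versions.get(url)
    known_versions[url] = version  # remember the latest version for the next crawl
    return previous is not None and previous != version
```

Counting the `True` results over one crawl gives the number of applications whose version changed in the current time period.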
Based on the same inventive concept as the above method, this embodiment also provides an application online/offline data statistics device which, as shown in Fig. 2, comprises: a data crawling module, configured to access the application addresses in the existing address data table using crawler technology; and an application statistics module, configured to obtain the query state returned by the server, count online applications and offline applications in the current time period according to the query state, and delete offline applications from the address data table.
The application online/offline data statistics device provided by this embodiment repeatedly crawls the application addresses in the address data table using crawler technology, and compiles statistics on which applications in an application store are online, have come online, or have gone offline within a period of time (for example a given day, week, or month).
Further, the application statistics module is specifically configured to: if the query state is access failure, put the application address whose access failed into a newly created error data table; if the query state is redirect, put the web page address after the redirect into the newly created error data table; after the address data table has been traversed, traverse the newly created error data table; while traversing an error data table, for query states of access failure or redirect, continue to create a new error data table to store the application addresses whose access failed or was redirected, until the preset condition is reached; if there are still application addresses whose access fails, consider those applications offline and move the offline application addresses into the offline data table.
The preset condition is that the number of traversals reaches a count threshold or the traversal time reaches a time threshold.
The application statistics module is further configured to: if the query state is redirect and application information can be crawled from the web page address after the redirect, add the redirect target address to the address data table.
The application statistics module is further configured to: if the query state is success, parse the response message returned by the server, judge from the message content whether the application version has been updated, and count application version updates in the current time period.
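The two-module structure of the device can be sketched as two small classes. This is a deliberately simplified sketch: the class and method names are hypothetical, `fetch` is an assumed callable that returns only a return code, and the error-table retry passes are omitted, so any address that does not return 200 is treated as offline straight away.

```python
class DataCrawlModule:
    """Visits every application address in the address data table."""

    def __init__(self, fetch):
        self.fetch = fetch  # hypothetical callable: url -> HTTP return code

    def crawl(self, address_table):
        return {url: self.fetch(url) for url in address_table}


class ApplicationStatisticsModule:
    """Turns return codes into online/offline statistics (retry passes omitted)."""

    def update(self, address_table, codes):
        offline = [url for url in address_table if codes.get(url) != 200]
        # Delete the offline applications from the address data table in place.
        address_table[:] = [url for url in address_table if codes.get(url) == 200]
        return {"online": len(address_table), "offline": len(offline)}, offline
```

Keeping crawling and statistics in separate objects mirrors the module split in Fig. 2 and lets the statistics logic be exercised with canned return codes.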
Finally, it should be noted that the above embodiments are only used to illustrate the technical solution of the present invention, not to limit it. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments can still be modified, or some or all of their technical features can be replaced by equivalents. Such modifications or replacements do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the invention, and shall all be covered by the scope of the claims and description of the invention.

Claims (10)

1. An application online/offline data statistics method, characterized by comprising:
accessing the application addresses in an existing address data table using crawler technology;
obtaining the query state returned by the server, counting online applications and offline applications in the current time period according to the query state, and deleting offline applications from the address data table.
2. The method according to claim 1, characterized in that counting online applications and offline applications in the current time period according to the query state comprises:
if the query state is access failure, putting the application address whose access failed into a newly created error data table; if the query state is redirect, putting the web page address after the redirect into the newly created error data table;
after the address data table has been traversed, traversing the newly created error data table; while traversing an error data table, for query states of access failure or redirect, continuing to create a new error data table to store the application addresses whose access failed or was redirected, until a preset condition is reached; if there are still application addresses whose access fails, considering those applications offline and moving the offline application addresses into an offline data table.
3. The method according to claim 2, characterized in that the preset condition is that the number of traversals reaches a count threshold or the traversal time reaches a time threshold.
4. The method according to claim 1, characterized by further comprising: if the query state is redirect and application information can be crawled from the web page address after the redirect, adding the redirect target address to the address data table.
5. The method according to claim 1, characterized by further comprising: if the query state is success, parsing the response message returned by the server, judging from the message content whether the application version has been updated, and counting application version updates in the current time period.
6. An application online/offline data statistics device, characterized by comprising:
a data crawling module, configured to access the application addresses in an existing address data table using crawler technology;
an application statistics module, configured to obtain the query state returned by the server, count online applications and offline applications in the current time period according to the query state, and delete offline applications from the address data table.
7. The device according to claim 6, characterized in that the application statistics module is specifically configured to:
if the query state is access failure, put the application address whose access failed into a newly created error data table; if the query state is redirect, put the web page address after the redirect into the newly created error data table;
after the address data table has been traversed, traverse the newly created error data table; while traversing an error data table, for query states of access failure or redirect, continue to create a new error data table to store the application addresses whose access failed or was redirected, until a preset condition is reached; if there are still application addresses whose access fails, consider those applications offline and move the offline application addresses into an offline data table.
8. The device according to claim 7, characterized in that the preset condition is that the number of traversals reaches a count threshold or the traversal time reaches a time threshold.
9. The device according to claim 6, characterized in that the application statistics module is further configured to: if the query state is redirect and application information can be crawled from the web page address after the redirect, add the redirect target address to the address data table.
10. The device according to claim 6, characterized in that the application statistics module is further configured to: if the query state is success, parse the response message returned by the server, judge from the message content whether the application version has been updated, and count application version updates in the current time period.
CN201710010785.4A 2017-01-06 2017-01-06 Application up-line and down-line data statistical method and device Expired - Fee Related CN108280094B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710010785.4A CN108280094B (en) 2017-01-06 2017-01-06 Application up-line and down-line data statistical method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710010785.4A CN108280094B (en) 2017-01-06 2017-01-06 Application up-line and down-line data statistical method and device

Publications (2)

Publication Number Publication Date
CN108280094A (en) 2018-07-13
CN108280094B (en) 2022-06-17

Family

ID=62800985

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710010785.4A Expired - Fee Related CN108280094B (en) 2017-01-06 2017-01-06 Application up-line and down-line data statistical method and device

Country Status (1)

Country Link
CN (1) CN108280094B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110153631A1 (en) * 2009-12-23 2011-06-23 Kondasani Thakur B Methods and systems for detecting broken links within a file
CN105528416A (en) * 2015-12-07 2016-04-27 中南大学 Method and system for monitoring update contents of website
CN105719162A (en) * 2016-01-20 2016-06-29 北京京东尚科信息技术有限公司 Method and device of monitoring validity of promotion links
CN106230809A (en) * 2016-07-27 2016-12-14 南京快页数码科技有限公司 A kind of mobile Internet public sentiment monitoring method based on URL and system

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109325050A (en) * 2018-08-01 2019-02-12 吉林盘古网络科技股份有限公司 Data query method, apparatus and terminal device
CN111046316A (en) * 2019-12-16 2020-04-21 北京智游网安科技有限公司 Application on-shelf state monitoring method, intelligent terminal and storage medium
CN111046316B (en) * 2019-12-16 2023-03-21 北京智游网安科技有限公司 Application on-shelf state monitoring method, intelligent terminal and storage medium

Also Published As

Publication number Publication date
CN108280094B (en) 2022-06-17

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20220617